<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v3.0 20080202//EN" "journalpublishing3.dtd">
<article xml:lang="en" article-type="research-article" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<?release-delay 0|0?>
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">BR</journal-id>
<journal-title-group>
<journal-title>Biomedical Reports</journal-title>
</journal-title-group>
<issn pub-type="ppub">2049-9434</issn>
<issn pub-type="epub">2049-9442</issn>
<publisher>
<publisher-name>D.A. Spandidos</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="publisher-id">BR-20-3-01733</article-id>
<article-id pub-id-type="doi">10.3892/br.2024.1733</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Articles</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>Genes encoding &#x03B3;‑glutamyl‑transpeptidases in the allicin biosynthetic pathway in garlic (<italic>Allium sativum</italic>)</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name><surname>Baltzi</surname><given-names>Eleni</given-names></name>
<xref rid="af1-BR-20-3-01733" ref-type="aff">1</xref>
<xref rid="af2-BR-20-3-01733" ref-type="aff">2</xref>
</contrib>
<contrib contrib-type="author">
<name><surname>Papaloukas</surname><given-names>Costas</given-names></name>
<xref rid="af2-BR-20-3-01733" ref-type="aff">2</xref>
</contrib>
<contrib contrib-type="author">
<name><surname>Spandidos</surname><given-names>Demetrios A.</given-names></name>
<xref rid="af3-BR-20-3-01733" ref-type="aff">3</xref>
</contrib>
<contrib contrib-type="author" corresp="yes">
<name><surname>Michalopoulos</surname><given-names>Ioannis</given-names></name>
<xref rid="af1-BR-20-3-01733" ref-type="aff">1</xref>
<xref rid="c1-BR-20-3-01733" ref-type="corresp"/>
</contrib>
</contrib-group>
<aff id="af1-BR-20-3-01733"><label>1</label>Centre of Systems Biology, Biomedical Research Foundation, Academy of Athens, 11527 Athens, Greece</aff>
<aff id="af2-BR-20-3-01733"><label>2</label>Department of Biological Applications and Technology, University of Ioannina, 45110 Ioannina, Greece</aff>
<aff id="af3-BR-20-3-01733"><label>3</label>Laboratory of Clinical Virology, Medical School, University of Crete, 71003 Heraklion, Greece</aff>
<author-notes>
<corresp id="c1-BR-20-3-01733"><italic>Correspondence to:</italic> Dr Ioannis Michalopoulos, Centre of Systems Biology, Biomedical Research Foundation, Academy of Athens, Soranou Efessiou 4, 11527 Athens, Greece <email>imichalop@bioacademy.gr</email></corresp>
<fn><p><italic>Abbreviations:</italic> AGAT, another Gtf/Gff analysis toolkit; AsGGT, <italic>Allium sativum</italic> &#x03B3;-glutamyl-transpeptidase; BLAST, basic local alignment search tool; CDS, coding sequence; FMO, flavin-dependent S-monooxygenase; GFF, general feature format; HMM, hidden Markov model; IGV, Integrative Genomics Viewer; NCBI, National Center for Biotechnology Information; SAC, S-allyl-L-cysteine; UTR, untranslated region</p></fn>
</author-notes>
<pub-date pub-type="collection">
<month>03</month>
<year>2024</year></pub-date>
<pub-date pub-type="epub">
<day>23</day>
<month>01</month>
<year>2024</year></pub-date>
<volume>20</volume>
<issue>3</issue>
<elocation-id>45</elocation-id>
<history>
<date date-type="received">
<day>16</day>
<month>11</month>
<year>2023</year></date>
<date date-type="accepted">
<day>16</day>
<month>01</month>
<year>2024</year></date>
</history>
<permissions>
<copyright-statement>Copyright: &#x00A9; Baltzi et al.</copyright-statement>
<copyright-year>2023</copyright-year>
<license license-type="open-access">
<license-p>This is an open access article distributed under the terms of the <ext-link ext-link-type="uri" xlink:href="https://creativecommons.org/licenses/by-nc-nd/4.0/">Creative Commons Attribution-NonCommercial-NoDerivs License</ext-link>, which permits use and distribution in any medium, provided the original work is properly cited, the use is non-commercial and no modifications or adaptations are made.</license-p></license>
</permissions>
<abstract>
<p>Allicin is a thiosulphinate molecule produced in garlic (<italic>Allium sativum</italic>) and has a wide range of biological actions and pharmaceutical applications. Its precursor molecule is the non-proteinogenic amino acid alliin (S-allylcysteine sulphoxide). The alliin biosynthetic pathway in garlic involves a group of enzymes, members of which are the &#x03B3;-glutamyl-transpeptidase isoenzymes, <italic>Allium sativum</italic> &#x03B3;-glutamyl-transpeptidase AsGGT1, AsGGT2 and AsGGT3, which catalyze the removal of the &#x03B3;-glutamyl group from &#x03B3;-glutamyl-S-allyl-L-cysteine to produce S-allyl-L-cysteine. This removal is followed by an S-oxygenation, which leads to the biosynthesis of alliin. The aim of the present study is to annotate previously discovered genes of garlic &#x03B3;-glutamyl-transpeptidases, as well as a fourth candidate gene (AsGGT4) that has not yet been described. The annotation includes identifying the loci of the genes in the garlic genome, revealing the overall structure and conserved regions of these genes, and elucidating the evolutionary history of these enzymes through their phylogenetic analysis. The genomic structure of &#x03B3;-glutamyl-transpeptidase genes is conserved; each gene consists of seven exons, and these genes are located on different chromosomes. AsGGT3 and AsGGT4 enzymes contain a signal peptide. To that end, the AsGGT3 protein sequence was corrected; four indel events occurring in AsGGT3 coding regions suggested that at least in the garlic variety Ershuizao, AsGGT3 may be a pseudogene. Finally, the use of protein structure prediction tools allowed the visualization of the tertiary structure of the candidate peptide.</p>
</abstract>
<kwd-group>
<kwd>garlic</kwd>
<kwd>allicin</kwd>
<kwd>&#x03B3;-glutamyl-transpeptidase</kwd>
<kwd>genome annotation</kwd>
<kwd>gene discovery</kwd>
</kwd-group>
<funding-group>
<funding-statement><bold>Funding:</bold> No funding was received.</funding-statement>
</funding-group>
</article-meta>
</front>
<body>
<sec sec-type="intro">
<title>Introduction</title>
<p>Traditional homemade remedies, such as garlic, have been used for the treatment of pain, inflammation and cardiovascular disease. Scientific research exploring the medicinal properties of garlic focuses on allicin, diallyl sulfate and other diallyls (<xref rid="b1-BR-20-3-01733" ref-type="bibr">1</xref>). The <italic>Allium</italic> species, including garlic, contain sulfoxides with unique medicinal properties, including antioxidant (<xref rid="b2-BR-20-3-01733" ref-type="bibr">2</xref>), anti-cancer (<xref rid="b3-BR-20-3-01733" ref-type="bibr">3</xref>), anti-viral, anti-microbial (<xref rid="b4-BR-20-3-01733" ref-type="bibr">4</xref>) and anti-fungal properties (<xref rid="b5-BR-20-3-01733" ref-type="bibr">5</xref>), and have been used in the treatment of diabetes (<xref rid="b6-BR-20-3-01733" ref-type="bibr">6</xref>,<xref rid="b7-BR-20-3-01733" ref-type="bibr">7</xref>) and periodontal disease (<xref rid="b8-BR-20-3-01733" ref-type="bibr">8</xref>,<xref rid="b9-BR-20-3-01733" ref-type="bibr">9</xref>) and for potentially preventing cardiovascular (<xref rid="b10-BR-20-3-01733 b11-BR-20-3-01733 b12-BR-20-3-01733" ref-type="bibr">10-12</xref>) and neurodegenerative diseases (<xref rid="b13-BR-20-3-01733" ref-type="bibr">13</xref>,<xref rid="b14-BR-20-3-01733" ref-type="bibr">14</xref>). This wide variety of therapeutic properties has motivated extensive research into garlic.</p>
<p>Allicin (diallyl thiosulfinate) is a prominently studied molecule linked to various beneficial properties; for instance, it plays a protective role in cardiovascular diseases (<xref rid="b15-BR-20-3-01733 b16-BR-20-3-01733 b17-BR-20-3-01733" ref-type="bibr">15-17</xref>). Its structure was described in 1948 (<xref rid="b18-BR-20-3-01733" ref-type="bibr">18</xref>). Produced by garlic upon tissue damage, allicin contributes to the defense of the plant and has a wide range of biological actions. Allicin is almost exclusively responsible for the antimicrobial action of freshly ground garlic (<xref rid="b19-BR-20-3-01733" ref-type="bibr">19</xref>) and it also presents antifungal activity (<xref rid="b20-BR-20-3-01733" ref-type="bibr">20</xref>).</p>
<p>Despite all the interest in allicin, not all enzymes involved in its biosynthetic pathway have been identified. In the final steps of the biosynthetic pathway of allicin, &#x03B3;-glutamyl transpeptidases catalyze the removal of glutamyl from &#x03B3;-glutamyl-S-allyl-L-cysteine to produce S-allyl-L-cysteine (SAC) (<xref rid="b21-BR-20-3-01733" ref-type="bibr">21</xref>), which in turn undergoes an S-oxygenation, catalyzed by the flavin-dependent S-monooxygenase (FMO) enzyme, resulting in the production of alliin (S-allylcysteine sulfoxide) (<xref rid="b22-BR-20-3-01733" ref-type="bibr">22</xref>). This non-proteinogenic amino acid is converted to allicin in a reaction catalyzed by the enzyme alliinase. As a major precursor of allicin, alliin is also crucial to scientific research in order to further explore the biosynthetic pathway of allicin (<xref rid="b23-BR-20-3-01733" ref-type="bibr">23</xref>).</p>
<p>Thus far, three genes &#x005B;<italic>Allium sativum</italic> &#x03B3;-glutamyl-transpeptidase (AsGGT)1, AsGGT2 and AsGGT3&#x005D; encoding &#x03B3;-glutamyl transpeptidases have been identified in garlic (<xref rid="b24-BR-20-3-01733" ref-type="bibr">24</xref>). Recombinant peptides of AsGGT1, AsGGT2 and AsGGT3 have exhibited notable deglutamylation activity towards alliin&#x0027;s intermediate, &#x03B3;-glutamyl-S-allyl-L-cysteine. These proteins can function as hydrolases without a suitable substrate; however, their activity increases with glycylglycine. The three peptides, AsGGT1, AsGGT2 and AsGGT3, differ in their affinity for the &#x03B3;-glutamyl-S-allyl-L-cysteine substrate. AsGGT1 and AsGGT2 have a high affinity for &#x03B3;-glutamyl-S-allyl-L-cysteine and contribute to alliin biosynthesis in leaves during bulb formation and maturation (<xref rid="b25-BR-20-3-01733" ref-type="bibr">25</xref>). AsGGT3 may contribute to alliin biosynthesis in bulbs upon dormancy termination (<xref rid="b25-BR-20-3-01733" ref-type="bibr">25</xref>). Additionally, AsGGT2 localizes to the vacuole, while AsGGT1 and AsGGT3 lack a signal peptide for intracellular organelles (<xref rid="b24-BR-20-3-01733" ref-type="bibr">24</xref>). These &#x03B3;-glutamyl transferases may contribute differently to alliin biosynthesis in garlic and may act synergistically (<xref rid="b25-BR-20-3-01733" ref-type="bibr">25</xref>).</p>
<p>The size of the garlic nuclear genome is &#x007E;16.9 Gbp, organized into eight chromosomes, and the number of predicted genes thus far is 57,561 (<xref rid="b25-BR-20-3-01733" ref-type="bibr">25</xref>). The garlic genome owes its large size primarily to polyploidy caused by whole genome duplication events and to transposable element proliferation. The main aim of the present study was to map known genes encoding &#x03B3;-glutamyl transpeptidases on the garlic genome and to search for unidentified ones, as it has been proposed that differences in the expression of enzymes involved in the biosynthesis of allicin may be related to variant allicin production levels between garlic cultivars.</p>
</sec>
<sec sec-type="Materials|methods">
<title>Materials and methods</title>
<p>For the comprehensive analysis of all AsGGT genes and gene products, a bioinformatics pipeline was followed (<xref rid="f1-BR-20-3-01733" ref-type="fig">Fig. 1</xref>).</p>
<sec>
<title/>
<sec>
<title>Genome mapping of characterized AsGGT genes</title>
<p>To search for the nucleotide sequences of garlic &#x03B3;-glutamyl transpeptidase transcripts in the National Center for Biotechnology Information (NCBI) GenBank (<xref rid="b26-BR-20-3-01733" ref-type="bibr">26</xref>), the following query was used: &#x2018;gamma-glutamyl transpeptidase&#x2019; AND &#x2018;<italic>Allium sativum</italic>&#x2019;&#x005B;porgn:__txid4682&#x005D;. The three resulting nucleotide sequences were stored in FASTA format. To identify and download their corresponding peptide sequences in FASTA format, the link of each NCBI GenBank entry was followed to its corresponding NCBI Protein (<ext-link ext-link-type="uri" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://www.ncbi.nlm.nih.gov/protein/">https://www.ncbi.nlm.nih.gov/protein/</ext-link>) entry.</p>
<p>The garlic genome sequence in FASTA format, as well as its corresponding annotation in general feature format (GFF) (<xref rid="b27-BR-20-3-01733" ref-type="bibr">27</xref>), were downloaded from NCBI Genome (<ext-link ext-link-type="uri" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://www.ncbi.nlm.nih.gov/genome/">https://www.ncbi.nlm.nih.gov/genome/</ext-link>). The GenBank accession of the genome assembly used in the present study was GCA_014155895.2 (<xref rid="b25-BR-20-3-01733" ref-type="bibr">25</xref>). Due to the size of the <italic>Allium sativum</italic> chromosomes, it was necessary to fragment them, as the basic local alignment search tool (BLAST)+ (<xref rid="b28-BR-20-3-01733" ref-type="bibr">28</xref>), which was used to search for sequences of interest in the garlic genome, cannot handle sequences longer than 1 Gbp. For this reason, a script in the PHP programming language was created to split the chromosome sequences into sequences with a maximum size of 1 Gbp. The preliminary identification of the genomic regions of the three known genes was based on the pairwise alignments between their nucleotide and protein sequences and those of the genome, as obtained from BLAST+ searches: the boundaries of the AsGGT exons were identified using BLASTN searches for the nucleotide sequences. Similarly, TBLASTN searches for the peptide sequences revealed the genomic coordinates of the coding sequence (CDS) of each exon for every AsGGT gene. Exonerate (<xref rid="b29-BR-20-3-01733" ref-type="bibr">29</xref>) was then used for the precise mapping of the nucleotide and protein sequences on the genome. Exonerate determined the exact boundaries for each gene, exon, CDS and 5&#x0027; and 3&#x0027; untranslated regions (UTRs), and produced a GFF file for each gene.
Finally, a manual inspection of these boundaries was performed using Integrative Genomics Viewer (IGV, Version: 2.16.1) (<xref rid="b30-BR-20-3-01733" ref-type="bibr">30</xref>). Exonerate and manual inspection ensured that the splice sites of the introns belong to the GU-AG group (<xref rid="b31-BR-20-3-01733" ref-type="bibr">31</xref>).</p>
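<p>The chromosome-splitting step can be illustrated with a minimal Python sketch. This is not the PHP script used in the present study; the function names, the chunk-naming scheme and the non-overlapping layout are assumptions made purely for illustration. Each chunk keeps the offset of its start on the original chromosome, so that BLAST+ hit positions reported on a chunk can be converted back to chromosome coordinates.</p>
<preformat>
```python
def split_sequence(name, sequence, max_len=1_000_000_000):
    """Split one sequence into consecutive chunks of at most max_len bases.

    Returns (chunk_name, offset, chunk) tuples, where offset is the 0-based
    position of the chunk start on the original sequence.
    The chunk naming scheme is hypothetical.
    """
    chunks = []
    for index, offset in enumerate(range(0, len(sequence), max_len), start=1):
        chunk = sequence[offset:offset + max_len]
        chunks.append((f"{name}_part{index}", offset, chunk))
    return chunks


def to_chromosome_coordinate(offset, position_in_chunk):
    """Convert a 1-based position within a chunk to a 1-based chromosome position."""
    return offset + position_in_chunk


# Toy example: a 10-base "chromosome" split with a 4-base limit.
parts = split_sequence("chrToy", "ACGTACGTAC", max_len=4)
for chunk_name, offset, chunk in parts:
    print(chunk_name, offset, chunk)
```
</preformat>
<p>A simple non-overlapping split such as this one could cut a gene that spans a chunk boundary; whether and how the original script handled this case is not stated here.</p>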
<p>To compare the manual annotation for the three genes with the annotations produced automatically (<xref rid="b25-BR-20-3-01733" ref-type="bibr">25</xref>), all relevant genomic features from the automatic GFF file were extracted to create a GFF file specifically for these genes. The automatically and manually created exon-intron boundaries were visually inspected using IGV, and the protein sequences were extracted from the GFF files and the garlic genome with Another Gtf/Gff Analysis Toolkit (AGAT) (<xref rid="b32-BR-20-3-01733" ref-type="bibr">32</xref>). Global pairwise alignments (<xref rid="b33-BR-20-3-01733" ref-type="bibr">33</xref>) between the automatically and manually produced protein sequences were performed.</p>
</sec>
<sec>
<title>Identification of AsGGT-coding exons on chromosomes and scaffolds using hidden Markov models (HMMs)</title>
<p>A multiple sequence alignment of the three protein sequences was performed using Clustal Omega (<xref rid="b34-BR-20-3-01733" ref-type="bibr">34</xref>). It was noted that apart from the first exon CDSs which did not present a high degree of conservation among the three peptides, the boundaries of the CDSs of all other exons matched perfectly. Based on this finding, HMM profiles for the multiple sequence alignments of the six conserved CDSs were built using HMMER (<xref rid="b35-BR-20-3-01733" ref-type="bibr">35</xref>). All characterized <italic>Allium sativum</italic> proteins were collected from NCBI Protein. This search yielded 804 peptide sequences which were stored as a single FASTA file. A search for the HMM of each exon against all characterized <italic>Allium sativum</italic> proteins was performed. This search did not yield any new peptide sequence beyond the three already known proteins; therefore, it was assumed that no other characterized protein was homologous to the known AsGGTs.</p>
<p>With the aid of a PHP script, the translation of the whole <italic>Allium sativum</italic> genome into the six open reading frames (ORFs) was performed as previously described (<xref rid="b36-BR-20-3-01733" ref-type="bibr">36</xref>). With this procedure, all possible peptides, as well as their corresponding GFFs, were generated. To search for homologous sequences potentially encoding AsGGT enzymes in addition to those already characterized, a PHP script searched the HMM of each exon against all potentially genome-coded peptides. This produced a GFF file per chromosome or scaffold, which contained all genomic regions that could code for AsGGT CDSs. To identify chromosomes or scaffolds encoding AsGGT genes, a manual check of their corresponding GFF files was performed. The GFF files for chromosome 8 (CM031537.1), chromosome 5 (CM031533.1) and chromosome 4 (CM031532.1) contained the CDSs of the already characterized AsGGT1, AsGGT2 and AsGGT3 genes, respectively. The GFF file for chromosome 6 (CM031534.1) contained consecutive CDSs which corresponded to exons 2-7, suggesting the existence of a near complete gene structure. By mapping the three known protein sequences on chromosome 6 with Exonerate and subsequently performing manual inspection with IGV, a GFF file containing the genomic features (exons 2-7) of this candidate gene was created.</p>
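<p>The six-frame translation described above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the PHP script of the present study: it uses the standard genetic code, labels the frames +1 to +3 and -1 to -3 (a hypothetical naming), marks codons containing ambiguous bases as X, and leaves stop codons as asterisks rather than splitting the translation into separate peptides.</p>
<preformat>
```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons only; unknown codons become X."""
    codons = (dna[i:i + 3] for i in range(0, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def six_frame_translation(dna):
    """Translate both strands in all three frames (frame labels are hypothetical)."""
    dna = dna.upper()
    frames = {}
    for strand, sequence in (("+", dna), ("-", reverse_complement(dna))):
        for shift in range(3):
            frames[f"{strand}{shift + 1}"] = translate(sequence[shift:])
    return frames


frames = six_frame_translation("ATGGCCTAA")
for label, peptide in frames.items():
    print(label, peptide)
```
</preformat>
<p>In a genome-scale setting, each translated stretch would additionally be written out together with its genomic coordinates and strand, so that an HMM hit on a peptide can be reported as a GFF feature.</p>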
</sec>
<sec>
<title>Discovering the first exon of the AsGGT-like gene on chromosome 6</title>
<p>The protein sequence of the newly discovered AsGGT-like (AsGGT4) gene was extracted from its GFF and the <italic>Allium sativum</italic> genome sequence, using AGAT. A BLASTP search limited to Liliopsida (monocots) was performed in UniProt (<xref rid="b37-BR-20-3-01733" ref-type="bibr">37</xref>) to examine whether the peptide in question had already been identified and to identify homologous peptide sequences in species related to <italic>Allium sativum</italic>. The peptide had not been previously identified and the UniProt BLASTP search revealed that its amino terminal end was occasionally aligned with the signal peptide of homologous proteins, suggesting that the potential enzyme may contain a signal peptide at its amino terminal end. Thus, SignalP 6.0 (<xref rid="b38-BR-20-3-01733" ref-type="bibr">38</xref>) and DeepTMHMM (<xref rid="b39-BR-20-3-01733" ref-type="bibr">39</xref>) were used to check for the existence of a eukaryotic signal peptide in the known AsGGT enzymes, as well as the potential one.</p>
<p>The confirmation of the existence of a signal peptide in the new enzyme allowed for the prediction of the length of the CDS region of its missing first exon: as signal peptides usually have a length of 22-25 amino acids and 19 amino acids are already found in the CDS region of exon 2 in this enzyme, it was expected that the CDS region of the first exon would encode &#x007E;4 amino acids, beginning with a methionine (AUG codon). In addition, the boundaries of the intron between exon 1 and exon 2 of the gene were expected to be of the GU-AG type, as explained above. By visually scanning the genomic area upstream of exon 2 in IGV, a sequence matching the above criteria was discovered, thus completing the potential gene sequence. GFF features of the genomic region where the AsGGT4 gene is located were extracted from the automatically produced GFF file (<xref rid="b25-BR-20-3-01733" ref-type="bibr">25</xref>). A comparison between the automatically and manually produced annotations, as depicted in their respective GFFs, was performed by visual inspection in IGV.</p>
</sec>
<sec>
<title>Mapping of RNA-seq reads on AsGGT transcript sequences with BLAST+</title>
<p>To ensure that AsGGT4 is actually transcribed, an alignment of all available <italic>Allium sativum</italic> short RNA-seq reads was performed on its potential transcript, using a PHP script. As a positive control, the same alignment was performed on the AsGGT3 transcript. AsGGT3 was selected as it was found to be the known AsGGT most similar to AsGGT4. SRA (<xref rid="b40-BR-20-3-01733" ref-type="bibr">40</xref>) was searched for all Illumina RNA-Seq runs of <italic>Allium sativum</italic> samples. The SRR IDs of these entries were used to download their corresponding fastq.gz files from the European Nucleotide Archive (ENA) (<xref rid="b41-BR-20-3-01733" ref-type="bibr">41</xref>). These files were then unzipped and converted into FASTA files using EMBOSS (<xref rid="b42-BR-20-3-01733" ref-type="bibr">42</xref>). Each FASTA file was converted into a BLAST database and the two transcript sequences were BLASTN-searched against the RNA-Seq read database. The outputs of each search were then manually examined to identify reads which could correspond to the AsGGT3 or AsGGT4 transcripts.</p>
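<p>The FASTQ-to-FASTA conversion step can be illustrated with a short Python sketch. The present study used EMBOSS for this step; the plain-Python stand-in below, including its function names, is an assumption for illustration only. It assumes well-formed four-line FASTQ records and keeps only the read identifier (the text before the first whitespace in the header) and the sequence.</p>
<preformat>
```python
import gzip


def fastq_to_fasta_records(lines):
    """Yield (read_id, sequence) pairs from well-formed four-line FASTQ records."""
    lines = iter(lines)
    for header in lines:
        if not header.strip():
            continue  # tolerate blank lines between records
        sequence = next(lines).strip()
        next(lines)  # separator line starting with '+'
        next(lines)  # base quality line, not needed for FASTA
        read_id = header.strip()[1:].split()[0]
        yield read_id, sequence


def convert(fastq_gz_path, fasta_path):
    """Stream a gzipped FASTQ file into an uncompressed FASTA file."""
    with gzip.open(fastq_gz_path, "rt") as reads, open(fasta_path, "w") as out:
        for read_id, sequence in fastq_to_fasta_records(reads):
            out.write(f">{read_id}\n{sequence}\n")


# Toy example with a single in-memory record.
example = ["@read1 length=4\n", "ACGT\n", "+\n", "IIII\n"]
records = list(fastq_to_fasta_records(example))
print(records)
```
</preformat>
<p>The resulting FASTA file would then be turned into a BLAST database and searched with the transcript sequences, as described above.</p>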
</sec>
<sec>
<title>Prediction of the tertiary structure of the AsGGT peptides</title>
<p>As no solved 3D structures for AsGGT proteins are available, AlphaFold (<xref rid="b43-BR-20-3-01733" ref-type="bibr">43</xref>) was used for structure predictions from primary peptide sequences. The predicted structures of the known AsGGTs were downloaded from the AlphaFold Protein Structure Database (<xref rid="b44-BR-20-3-01733" ref-type="bibr">44</xref>). A prediction of the tertiary structure of the AsGGT4 was performed within the ColabFold website (<xref rid="b45-BR-20-3-01733" ref-type="bibr">45</xref>), which uses a simplified version of AlphaFold v2.3.2. For AsGGT4 structure prediction, homology modelling was also performed using the SWISS-MODEL Workspace (<xref rid="b46-BR-20-3-01733" ref-type="bibr">46</xref>).</p>
</sec>
<sec>
<title>Construction of a phylogenetic tree for the four genes</title>
<p>Since SWISS-MODEL used the solved structure of <italic>Bacillus licheniformis</italic> &#x03B3;-glutamyl-transpeptidase &#x005B;Protein Data Bank (PDB) ID: 4Y23&#x005D; to predict the structure of the AsGGT4 peptide by homology modelling, the bacterial peptide sequence was used as a distant evolutionary relative (outgroup) of <italic>Allium sativum</italic> AsGGT peptides. This sequence was downloaded from PDB (<xref rid="b47-BR-20-3-01733" ref-type="bibr">47</xref>) and was then searched using BLASTP in UniProt. The ID of its corresponding UniProt entry was Q65KZ6_BACLD. MUSCLE (<xref rid="b48-BR-20-3-01733" ref-type="bibr">48</xref>) was used to generate a multiple sequence alignment and a phylogenetic tree of the peptide sequences of the four AsGGTs and their distant relative, <italic>Bacillus licheniformis</italic> &#x03B3;-glutamyl transpeptidase. Visualization of the multiple sequence alignment was performed with JalView (<xref rid="b49-BR-20-3-01733" ref-type="bibr">49</xref>). The phylogenetic tree of the five peptides in Newick format (<xref rid="b50-BR-20-3-01733" ref-type="bibr">50</xref>) was visualized with Dendroscope (<xref rid="b51-BR-20-3-01733" ref-type="bibr">51</xref>).</p>
</sec>
</sec>
</sec>
<sec sec-type="Results">
<title>Results</title>
<sec>
<title/>
<sec>
<title>Mapping of the three already characterized AsGGT genes</title>
<p>The constructed GFF files for the AsGGT1, AsGGT2 and AsGGT3 genes were visualized (<xref rid="f2-BR-20-3-01733" ref-type="fig">Fig. 2A-C</xref>). The analysis revealed that AsGGT1 is located on chromosome 8: 403,215,661-403,234,269, AsGGT2 on chromosome 5: 617,284,828-617,294,697 and AsGGT3 on chromosome 4: 182,866,703-182,871,895. All three genes consist of seven exons of comparable size. Although the size of each corresponding exon is comparable among the three genes, the size of the corresponding introns varies, resulting in a considerable difference in gene size: The AsGGT1 gene size is 18,609 bp, that of AsGGT2 is 9,870 bp and that of AsGGT3 is 5,193 bp.</p>
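<p>The gene sizes reported above follow directly from the genomic coordinates, since GFF coordinates are 1-based and inclusive, so that length equals end minus start plus one. The following Python sketch recomputes them; the dictionary layout and function name are illustrative assumptions.</p>
<preformat>
```python
# Coordinates as reported in the text: (chromosome, start, end), 1-based inclusive.
GENE_COORDINATES = {
    "AsGGT1": ("chromosome 8", 403_215_661, 403_234_269),
    "AsGGT2": ("chromosome 5", 617_284_828, 617_294_697),
    "AsGGT3": ("chromosome 4", 182_866_703, 182_871_895),
}


def gene_length(start, end):
    """Length of a feature with 1-based inclusive coordinates."""
    return end - start + 1


GENE_SIZES = {
    gene: gene_length(start, end)
    for gene, (_, start, end) in GENE_COORDINATES.items()
}

for gene, size in GENE_SIZES.items():
    print(f"{gene}: {size:,} bp")
```
</preformat>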
<p>Through a manual inspection of the genomic features of AsGGT3 on chromosome 4, it became apparent that there are four loci in the genomic sequence where single base deletions or insertions (indel events) occurred (<xref rid="f3-BR-20-3-01733" ref-type="fig">Fig. 3</xref>). An addition of an adenine appears at position 182,867,809 in the second exon, an addition of a guanine appears at position 182,869,779 in the third exon, a deletion of an adenine appears at position 182,870,778 in the fourth exon and an addition of a guanine appears at position 182,870,929 in the seventh exon.</p>
</sec>
<sec>
<title>Mapping of the potential &#x03B3;-glutamyl transpeptidase gene (AsGGT4)</title>
<p>Searching the potential proteome, which resulted from the <italic>in silico</italic> translation of the genome into the six ORFs, with the HMMs of the six coding sequences corresponding to already characterized AsGGT exons not only identified the coding regions of the known genes, but also allowed for the discovery of novel genomic regions that could encode parts of &#x03B3;-glutamyl transpeptidase enzymes.</p>
<p>By filtering the search results and keeping only those containing adjacent regions corresponding to all exons that comprise a potential AsGGT gene, a crude GFF was generated that described the coding regions of a potential fourth &#x03B3;-glutamyl transpeptidase gene (AsGGT4). The complete GFF file for AsGGT4 was constructed by refining this crude GFF. AsGGT4 is located on chromosome 6: 92,586,715-92,589,081 and has a size of 2,336 bp. Similar to the three already characterized genes (AsGGT1, AsGGT2 and AsGGT3), the AsGGT4 gene (<xref rid="f2-BR-20-3-01733" ref-type="fig">Figs. 2D</xref> and <xref rid="f4-BR-20-3-01733" ref-type="fig">4</xref>) consists of seven exons. Unlike the already characterized genes, AsGGT4 is transcribed in the reverse direction.</p>
</sec>
<sec>
<title>Signal peptide in AsGGTs</title>
<p>A BLASTP search in UniProt for homologous monocot protein sequences with the potential &#x03B3;-glutamyl-transpeptidase sequence revealed a number of sequences belonging to different species (<xref rid="f5-BR-20-3-01733" ref-type="fig">Fig. 5</xref>). Visual inspection of the resulting pairwise alignments revealed that a number of homologous peptides possessed a signal peptide. It was therefore examined, using SignalP and DeepTMHMM, whether the three known enzymes and the potential one contain a signal peptide in their amino terminal sequence. SignalP did not predict a signal peptide for the AsGGT1 or AsGGT2 peptides. DeepTMHMM predicted a single transmembrane helix in the 32-47 and 30-46 amino acid regions of AsGGT1 and AsGGT2, respectively. It also predicted that their C-terminus is extracellular. SignalP revealed a potential signal peptide profile, mainly in the 23-46 amino acid region of AsGGT3 (<xref rid="f6-BR-20-3-01733" ref-type="fig">Fig. 6A</xref>), beginning with a methionine at position 23. When the first 22 amino acids were removed from the sequence, SignalP revealed the presence of a signal peptide, where amino acids 1-7 form its amino terminal region, amino acids 8-18 are hydrophobic and finally amino acids 19-24 form its carboxy terminal end (<xref rid="f6-BR-20-3-01733" ref-type="fig">Fig. 6B</xref>). DeepTMHMM also predicted a signal peptide for both the untruncated and truncated AsGGT3 peptide.</p>
<p>SignalP predicted a 19 amino acid long signal peptide in the amino terminal coding region of the second exon (<xref rid="f7-BR-20-3-01733" ref-type="fig">Fig. 7A</xref>), implying that the coding sequence of the first exon of AsGGT4 would code for approximately another four amino acids. By manually inspecting the genomic region upstream of the second exon, a first exon which could code for four amino acids was proposed. SignalP predicted a signal peptide of 23 amino acids in the full length AsGGT4 peptide, where the first four amino acids form its amino terminal region, amino acids 5-18 are hydrophobic, while amino acids 19-23 form its carboxy terminus (<xref rid="f7-BR-20-3-01733" ref-type="fig">Fig. 7B</xref>). DeepTMHMM also predicted a signal peptide.</p>
</sec>
<sec>
<title>Comparison of automatic and manual genome annotation</title>
<p>The coding regions of AsGGT1 and AsGGT2 are identical to the coding regions obtained by automated genomic annotation; i.e., from the use of the GFF file that was provided by the group that performed the genome sequencing (<xref rid="f8-BR-20-3-01733" ref-type="fig">Fig. 8A</xref> and <xref rid="f8-BR-20-3-01733" ref-type="fig">B</xref>); by contrast, automated genomic annotation failed to correctly assign the exon boundaries and coding regions of AsGGT3 (<xref rid="f8-BR-20-3-01733" ref-type="fig">Fig. 8C</xref>). The extracted protein sequence also differed from the one previously reported (<xref rid="b24-BR-20-3-01733" ref-type="bibr">24</xref>).</p>
<p>Pairwise alignments between the protein sequences extracted from manual and automatic annotation revealed that the peptide sequences of the AsGGT1 and AsGGT2 genes from the automatic annotation matched almost perfectly with the sequences obtained experimentally and with the annotation performed in the present study; small differences may be due to point mutations. By contrast, marked differences were observed between the automatic annotation of the peptides of the AsGGT3 and AsGGT4 genes and the annotation performed in the present study. In AsGGT3, differences appear at the four points where indel events occur in the genomic region, as expected, but the exon boundaries are generally correctly described beyond these points. The automatic annotation of AsGGT4, however, appears to have failed almost completely. Although the exon boundaries are correctly predicted (<xref rid="f8-BR-20-3-01733" ref-type="fig">Fig. 8D</xref>), the ORF is incorrect in all exons, and the first exon is completely missing; the start of the coding region is incorrectly located in the second exon. Thus, pairwise alignment does not detect any similarity between the two extracted peptide sequences, apart from a minute region that is probably accidental (<xref rid="f9-BR-20-3-01733" ref-type="fig">Fig. 9</xref>). When the phase is changed from 2 to 1 in the automatically produced GFF, the resulting peptide sequence is the correct one.</p>
</sec>
<sec>
<title>Phylogenetic analysis</title>
<p>A multiple sequence alignment (<xref rid="f10-BR-20-3-01733" ref-type="fig">Fig. 10</xref>) of the four garlic AsGGT peptide sequences and the <italic>Bacillus licheniformis</italic> &#x03B3;-glutamyl transpeptidase sequence was performed and the radial phylogram of the five peptides was constructed (<xref rid="f11-BR-20-3-01733" ref-type="fig">Fig. 11</xref>). In the radial phylogram, the root of the AsGGTs subtree is the point at which the outgroup (the <italic>Bacillus licheniformis</italic> peptide) connects to the subtree of the AsGGT peptides. The root, which represents the common ancestor of the four AsGGTs, appears to have given rise to two ancestral peptides, which subsequently gave rise to the peptides AsGGT1 and AsGGT2, and the peptides AsGGT3 and AsGGT4.</p>
</sec>
<sec>
<title>Prediction of the tertiary structure of the AsGGTs</title>
<p>The predicted structures of all AsGGTs are relatively similar and consist of one &#x03B2;-sandwich which is surrounded by clusters of interacting &#x03B1;-helices. The two cartoon-like visualizations of AsGGT4 produced by SWISS-MODEL (<xref rid="f12-BR-20-3-01733" ref-type="fig">Fig. 12</xref>) and AlphaFold (<xref rid="f13-BR-20-3-01733" ref-type="fig">Fig. 13</xref>) (the color spectrum in both visualizations reflects the confidence of prediction) were superimposed (<xref rid="f14-BR-20-3-01733" ref-type="fig">Fig. 14</xref>) with PyMOL software (<xref rid="b52-BR-20-3-01733" ref-type="bibr">52</xref>). This revealed that the structures predicted by both algorithms are very similar. Most importantly, the catalytic T372 remains at the same position in the active site in both predictions. The position of a flexible loop relative to the active site appears displaced between the two structures, with low prediction confidence in both structures. The main difference between the two predictions is that an unstructured sequence, which corresponds to the predicted signal peptide, only appears in the AlphaFold structure, with low prediction confidence. As with AsGGT4, AlphaFold also fails to predict any secondary structure for the predicted signal peptide of AsGGT3. Notably, AlphaFold predicts a helical structure for the DeepTMHMM-predicted transmembrane helix of AsGGT1, but not for the same feature in AsGGT2.</p>
</sec>
<sec>
<title>Detection of AsGGT4 expression</title>
<p>A total of 366 paired-end and five single-end Illumina RNA-Seq runs from garlic were identified (as of March 24, 2023). In the vast majority of these, reads corresponding to AsGGT3 transcripts were identified. By contrast, only a small proportion of runs contained AsGGT4 reads. The majority of these reads were found in the SRR13219906 run, which corresponds to a leaf tissue replicate of garlic grown under normal soil moisture conditions. Other runs containing reads corresponding to AsGGT4 transcripts included those of the PRJNA566287 and PRJNA489986 BioProjects, which derive from long-vernalization, short-vernalization and non-vernalized stored clove samples, and from leaf and clove samples, respectively.</p>
</sec>
</sec>
</sec>
<sec sec-type="Discussion">
<title>Discussion</title>
<p>To map known genes and discover new ones, to characterize fully their genomic structure, to predict the tertiary structure of the proteins they encode and to discover their evolutionary history, an integrated workflow based on a number of cutting-edge bioinformatics tools was developed.</p>
<p>All four AsGGT genes consist of seven exons, and there appears to be a conservation in the overall structure of these exons. All introns of all four genes belong to the GT-AG splice site group. There is, however, a large discrepancy in length between the four genes, due to the large difference in size of their introns.</p>
<p>Frameshift mutations are base insertions or deletions (indels) within the coding region of a gene that disrupt the reading frame in such a manner that the entire set of triplets following the mutation site is altered. Often a termination codon forms within the coding sequence, leading to premature termination of mRNA translation and hence to a shorter non-functional peptide (<xref rid="b53-BR-20-3-01733" ref-type="bibr">53</xref>). In the case of the AsGGT3 gene, four indel events were observed during the alignment of the transcript and its protein sequence onto the genome. A possible explanation, at least for three of the four incomplete alignments, is that sequencers may misestimate the length of a run of identical nucleotides (<xref rid="b54-BR-20-3-01733" ref-type="bibr">54</xref>). Thus, in the first insertion, the sequencer estimated that there were two adenines when there may have been one. Similarly, in the last insertion, it considered that there were three guanines, whereas there were probably two. Finally, in the deletion, it predicted the presence of one adenine, whereas there were probably two. These potential errors in the sequencing and assembly of the genomic sequence misled the automatic annotation and made manual mapping of this gene particularly challenging.</p>
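<p>The effect of a miscounted run of identical nucleotides on the reading frame can be illustrated with a minimal Python sketch. The 21-nucleotide coding sequence used here is hypothetical and was constructed purely for illustration (it is not taken from any AsGGT gene), and the names <italic>translate</italic> and <italic>insert_base</italic> are assumptions introduced for this example; the codon table is the standard genetic code. Inserting one extra adenine into the AAA run shifts the frame and, in this toy case, creates a premature TGA stop codon.</p>
<preformat preformat-type="code"><![CDATA[
```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds):
    """Translate a coding sequence from its first base up to the first stop codon."""
    peptide = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        if aa == "*":  # stop codon: translation terminates here
            break
        peptide.append(aa)
    return "".join(peptide)


def insert_base(cds, position, base):
    """Return a copy of cds with one base inserted before the given 0-based position."""
    return cds[:position] + base + cds[position:]


# Hypothetical toy ORF: ATG GCT AAA CAT GAC GGT TAA
reference = "ATGGCTAAACATGACGGTTAA"
# Simulate a homopolymer miscount: the AAA run is read as AAAA.
shifted = insert_base(reference, 9, "A")

print(translate(reference))  # peptide encoded by the intact frame
print(translate(shifted))    # truncated peptide after the frameshift
```
]]></preformat>
<p>In this toy example, the intact frame encodes the peptide MAKHDG, whereas the sequence carrying the extra adenine is read as ATG GCT AAA ACA TGA and yields only MAKT before the premature stop codon.</p>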
<p>If the observed single-nucleotide indel events are not sequencing errors, then they lead to successive frameshifts of the ORF. Any one of the four frameshift mutations in the AsGGT3 coding region would be sufficient to render the gene inactive, let alone the accumulation of all four. Therefore, if these mutations are indeed present, the AsGGT3 gene is in fact a pseudogene, at least in the garlic cultivar Ershuizao used in the genome sequencing (<xref rid="b25-BR-20-3-01733" ref-type="bibr">25</xref>). By contrast, in the garlic cultivar Fukuchi-howaito, which was used to discover and characterize the three AsGGT genes (<xref rid="b24-BR-20-3-01733" ref-type="bibr">24</xref>), AsGGT3 still appears functional. The existence of GGT pseudogenes in garlic cannot be excluded, as a GGT homolog in humans, GGT2, does not encode a functional enzyme (<xref rid="b55-BR-20-3-01733" ref-type="bibr">55</xref>).</p>
<p>This accumulation of mutations in the AsGGT3 gene may be due to garlic domestication. During the domestication of an organism, selection is based on specific traits. In garlic, this process has led, among other things, to the loss of native reproduction, so that garlic is now exclusively clonal (<xref rid="b56-BR-20-3-01733" ref-type="bibr">56</xref>). If a strain with inactivated AsGGT3 had been selected, all offspring would have inherited it. Once the first mutation that inactivates the gene has occurred, new mutations can accumulate, as there is no longer any natural selection pressure against them. This inactivation of AsGGT3 may partially explain the observed differences in the amount of allicin produced by different garlic cultivars (<xref rid="b57-BR-20-3-01733" ref-type="bibr">57</xref>).</p>
<p>When constructing the GFF file for AsGGT3, it was difficult to represent the ORF shift in a manner that would be recognized by visualization software. In the GFF3 format, alignment gaps are described by the &#x2018;Gap&#x2019; attribute. Its value consists of a series of pairs (mode and length) separated by spaces, for example &#x2018;M8 D3 M6&#x2019;, where each mode is represented by a code (<xref rid="b27-BR-20-3-01733" ref-type="bibr">27</xref>). IGV does not support mapping of frameshifts, as its developers consider that there is no critical mass demanding this feature (determined by personal communication).</p>
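<p>To make the structure of the &#x2018;Gap&#x2019; attribute concrete, the following minimal Python sketch parses a value such as &#x2018;M8 D3 M6&#x2019; into (code, length) pairs. It assumes the codes listed in the GFF3 specification (M, match; I, gap inserted into the reference; D, gap inserted into the target; F and R, forward and reverse frameshift in the reference). The function names <italic>parse_gap</italic> and <italic>aligned_lengths</italic> are hypothetical, and the length calculation is restricted to nucleotide-to-nucleotide alignments without frameshift codes; it is not the procedure used to build the AsGGT3 GFF file.</p>
<preformat preformat-type="code"><![CDATA[
```python
import re


def parse_gap(value):
    """Split a GFF3 Gap attribute value into a list of (code, length) pairs."""
    operations = []
    for token in value.split():
        match = re.fullmatch(r"([MIDFR])(\d+)", token)
        if match is None:
            raise ValueError(f"invalid Gap token: {token!r}")
        operations.append((match.group(1), int(match.group(2))))
    return operations


def aligned_lengths(operations):
    """Return (reference_length, target_length) covered by the alignment.

    Sketch for nucleotide-to-nucleotide alignments only: M consumes both
    sequences, D consumes the reference only, I consumes the target only.
    Frameshift codes (F, R) are deliberately not handled here.
    """
    reference = target = 0
    for code, length in operations:
        if code == "M":
            reference += length
            target += length
        elif code == "D":
            reference += length
        elif code == "I":
            target += length
        else:
            raise ValueError("frameshift codes F and R are not handled in this sketch")
    return reference, target


ops = parse_gap("M8 D3 M6")
print(ops)
print(aligned_lengths(ops))
```
]]></preformat>
<p>For &#x2018;M8 D3 M6&#x2019;, the alignment covers 17 positions of the reference and 14 of the target, the difference being the three reference positions that have no counterpart in the target.</p>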
<p>The majority of proteins intended for transport to subcellular compartments have a signal peptide at their amino terminal end. The newly synthesized precursor proteins are localized and transported across the membrane through a multitude of possible pathways (<xref rid="b58-BR-20-3-01733 b59-BR-20-3-01733 b60-BR-20-3-01733" ref-type="bibr">58-60</xref>). The signal peptide is responsible for transporting the remaining polypeptide across the membrane. After proteins have crossed the membrane or during transport, their signal peptides must be removed to activate the mature proteins when they reach their destination (<xref rid="b60-BR-20-3-01733" ref-type="bibr">60</xref>).</p>
<p>Signal peptides consist of three distinct regions: The amino-terminal end, the hydrophobic core and the carboxy-terminal end. The hydrophobic core constitutes the largest part of the signal peptide and contains 10-15 amino acid residues. This stretch of hydrophobic residues appears to adopt an &#x03B1;-helical conformation across the plasma membrane (<xref rid="b61-BR-20-3-01733" ref-type="bibr">61</xref>).</p>
<p>In <italic>Escherichia coli</italic> (<italic>E. coli</italic>), the sequence of the &#x03B3;-glutamyl-transpeptidase gene contains a single ORF, encoding a signal peptide at the amino-terminal end and some large and small functional regions. The <italic>E. coli</italic> &#x03B3;-glutamyl-transpeptidase signal peptide, consisting of 25 amino acids, is cleaved and the mature GGT localizes to the periplasmic space without anchoring to the membrane (<xref rid="b62-BR-20-3-01733" ref-type="bibr">62</xref>). By contrast, in mammalian cells, GGT is located on the outside of the plasma membrane with the amino terminal end of the large functional domain anchored to the cell membrane (<xref rid="b63-BR-20-3-01733" ref-type="bibr">63</xref>). Of note, of the four AsGGT isoenzymes, only AsGGT3 and AsGGT4 have a signal peptide, suggesting that they probably do not perform exactly the same role as AsGGT1 and AsGGT2.</p>
<p>SWISS-MODEL failed to predict any structure of the AsGGT4 signal peptide, whereas AlphaFold did not predict any secondary structure for the signal peptide. The signal peptides of the enzymes encoded by the AsGGT3 and AsGGT4 genes are expected to perform a role similar to that of the signal peptides of <italic>E. coli</italic> or mammalian &#x03B3;-glutamyl transpeptidases. They may signal the need to transport the newly synthesized protein across membranes. The prediction of a single transmembrane helix close to the N-terminus of AsGGT1 and AsGGT2 suggests that they are transmembrane proteins and that their C-termini are extracellular. Hints for the subcellular location of a protein can be provided if the subcellular compartment of its homologs in other species is characterized: According to the Human Protein Atlas (<xref rid="b64-BR-20-3-01733" ref-type="bibr">64</xref>), human GGT1 may be localized to the vesicles, GGT5 to the nucleoli fibrillar center and GGT7 to the vesicles and the nucleoplasm. All three human homologs are predicted by DeepTMHMM to contain a single transmembrane helix, like the one predicted for garlic AsGGT1 and AsGGT2.</p>
<p>During automated genome annotation in eukaryotes, it is assumed, somewhat arbitrarily, that the first methionine codon of a transcript is the start codon of its coding region (CDS). This can lead to an incorrect annotation, since translation may be initiated at a subsequent methionine codon. Presumably such an error occurred in the case of the automatic genomic annotation of AsGGT3, which resulted in the presence of a signal peptide being overlooked (<xref rid="b24-BR-20-3-01733" ref-type="bibr">24</xref>). A similar case to that of AsGGT3 was observed in the glutathione hydrolase of <italic>Oryza meyeriana</italic> var. <italic>granulata</italic> (UniProt: A0A6G1BRJ1_9ORYZ). As in the case of AsGGT3, the first amino acids of the rice glutathione hydrolase are probably not part of the protein and the peptide begins from a later methionine. Thus, the AsGGT3 sequence was modified by deleting the first 28 amino acids.</p>
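<p>The consequence of the first-methionine assumption can be sketched in a few lines of Python. The peptide below is a hypothetical toy sequence (not the AsGGT3 sequence), and the function names <italic>candidate_starts</italic> and <italic>trim_to_start</italic> are assumptions introduced for illustration. The sketch lists every methionine that could serve as an alternative translation start and returns the peptide trimmed to a chosen one; deciding which candidate is the true start still requires additional evidence, such as signal peptide prediction or homology.</p>
<preformat preformat-type="code"><![CDATA[
```python
def candidate_starts(peptide):
    """Return the 0-based positions of all methionines in a peptide."""
    return [i for i, residue in enumerate(peptide) if residue == "M"]


def trim_to_start(peptide, start):
    """Return the peptide beginning at the methionine at the given position."""
    if peptide[start] != "M":
        raise ValueError("the chosen start position is not a methionine")
    return peptide[start:]


# Hypothetical annotated peptide whose first methionine may not be the true start.
annotated = "MSLRTMKLAVFLLA"

starts = candidate_starts(annotated)
print(starts)
# Re-annotate the peptide from the second candidate methionine.
print(trim_to_start(annotated, starts[1]))
```
]]></preformat>
<p>Here the toy peptide contains methionines at positions 0 and 5, and trimming to the second one removes the first five residues, which is analogous in principle to the removal of the first 28 amino acids of the annotated AsGGT3 sequence.</p>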
<p>The fact that the pHMMs constructed from the garlic GGT peptides identify even bacterial &#x03B3;-glutamyl transpeptidase sequences and that AsGGT4 homology modelling was based on a solved bacterial GGT structure suggests that &#x03B3;-glutamyl-transpeptidases appeared in evolutionary history before the prokaryote-eukaryote split over three and a half billion years ago and have remained conserved ever since. Such conserved genes confer properties necessary for the survival and adaptation of organisms (<xref rid="b65-BR-20-3-01733" ref-type="bibr">65</xref>). Generally, in bacteria, yeasts, plants and mammals, GGT is a heteromeric protein, consisting of large and small subunits, both produced from a common inactive polypeptide precursor through a process of autocatalysis (<xref rid="b66-BR-20-3-01733 b67-BR-20-3-01733 b68-BR-20-3-01733" ref-type="bibr">66-68</xref>). Some plants, such as tomato, onion and radish, are considered to possess GGT enzymes composed of a single polypeptide, although their sequences remain unknown (<xref rid="b69-BR-20-3-01733" ref-type="bibr">69</xref>,<xref rid="b70-BR-20-3-01733" ref-type="bibr">70</xref>). The amino acid sequences of AsGGT1, AsGGT2 and AsGGT3 possess a conserved threonine residue required for autocatalysis and amino acid residues essential for GGT activity, which have been identified by biochemical analyses in human and <italic>E. coli</italic> (<xref rid="b71-BR-20-3-01733" ref-type="bibr">71</xref>,<xref rid="b72-BR-20-3-01733" ref-type="bibr">72</xref>). Plant GGTs are classified in the same evolutionary clade, which is further divided into two distinct subclades. 
AsGGT1 and AsGGT2 belong to the subclade containing <italic>Arabidopsis thaliana</italic> AtGGT4, which is involved in the degradation of glutathione S-conjugates in the vacuole (<xref rid="b73-BR-20-3-01733 b74-BR-20-3-01733 b75-BR-20-3-01733" ref-type="bibr">73-75</xref>), whereas AsGGT3 belongs to the subclade containing <italic>Arabidopsis thaliana</italic> AtGGT1 and AtGGT2, which are involved in the degradation of extracellular glutathione (<xref rid="b74-BR-20-3-01733 b75-BR-20-3-01733 b76-BR-20-3-01733" ref-type="bibr">74-76</xref>), along with onion AcGGT (<xref rid="b77-BR-20-3-01733" ref-type="bibr">77</xref>). In the present study, the phylogenetic analysis revealed that the AsGGT1 (BAQ21911.1) and AsGGT2 (BAQ21912.1) peptides have a common ancestor, which probably lacked a signal peptide, and that AsGGT3 (BAQ21913.1) shares a common ancestor with AsGGT4, which probably contained a signal peptide.</p>
<p>The transcriptomic analysis showed that, while AsGGT3 is ubiquitously expressed, AsGGT4 expression is probably tissue-, condition- and/or developmental stage-specific. Thus, it comes as no surprise that the first attempt to identify garlic GGT genes (<xref rid="b24-BR-20-3-01733" ref-type="bibr">24</xref>), made before the genomic sequence was known, identified AsGGT1, AsGGT2 and AsGGT3, but failed to identify AsGGT4. Therefore, it is possible that AsGGT4 plays a role different from that of the other AsGGTs, which has yet to be discovered.</p>
</sec>
</body>
<back>
<ack>
<title>Acknowledgements</title>
<p>Not applicable.</p>
</ack>
<sec sec-type="data-availability">
<title>Availability of data and materials</title>
<p>The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.</p>
</sec>
<sec>
<title>Authors&#x0027; contributions</title>
<p>IM, CP and DAS conceived the study. The study methodology was proposed by IM. Data were retrieved, curated and analyzed by EB and IM. Software was developed by IM. Visualization was performed by EB. IM and CP supervised the study. The manuscript was written by EB and IM. EB and IM confirm the authenticity of all the raw data. All authors contributed to the revision of the work, and have read and approved the final manuscript.</p>
</sec>
<sec>
<title>Ethics approval and consent to participate</title>
<p>Not applicable.</p>
</sec>
<sec>
<title>Patient consent for publication</title>
<p>Not applicable.</p>
</sec>
<sec sec-type="COI-statement">
<title>Competing interests</title>
<p>DAS is the Editor-in-Chief for the journal, but had no personal involvement in the reviewing process, or any influence in terms of adjudicating on the final decision, for this article. The other authors declare that they have no competing interests.</p>
</sec>
<ref-list>
<title>References</title>
<ref id="b1-BR-20-3-01733"><label>1</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Tesfaye</surname><given-names>A</given-names></name></person-group><article-title>Revealing the therapeutic uses of garlic (<italic>Allium sativum</italic>) and its potential for drug discovery</article-title><source>Sci World J</source><volume>2021</volume><fpage>1</fpage><lpage>7</lpage><year>2021</year><pub-id pub-id-type="pmid">35002548</pub-id><pub-id pub-id-type="doi">10.1155/2021/8817288</pub-id></element-citation></ref>
<ref id="b2-BR-20-3-01733"><label>2</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Tsuneyoshi</surname><given-names>T</given-names></name></person-group><article-title>BACH1 mediates the antioxidant properties of aged garlic extract</article-title><source>Exp Ther Med</source><volume>19</volume><fpage>1500</fpage><lpage>1503</lpage><year>2020</year><pub-id pub-id-type="pmid">32010329</pub-id><pub-id pub-id-type="doi">10.3892/etm.2019.8380</pub-id></element-citation></ref>
<ref id="b3-BR-20-3-01733"><label>3</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Kanamori</surname><given-names>Y</given-names></name><name><surname>Via</surname><given-names>LD</given-names></name><name><surname>Macone</surname><given-names>A</given-names></name><name><surname>Canettieri</surname><given-names>G</given-names></name><name><surname>Greco</surname><given-names>A</given-names></name><name><surname>Toninello</surname><given-names>A</given-names></name><name><surname>Agostinelli</surname><given-names>E</given-names></name></person-group><article-title>Aged garlic extract and its constituent, S-allyl-L-cysteine, induce the apoptosis of neuroblastoma cancer cells due to mitochondrial membrane depolarization</article-title><source>Exp Ther Med</source><volume>19</volume><fpage>1511</fpage><lpage>1521</lpage><year>2020</year><pub-id pub-id-type="pmid">32010332</pub-id><pub-id pub-id-type="doi">10.3892/etm.2019.8383</pub-id></element-citation></ref>
<ref id="b4-BR-20-3-01733"><label>4</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Nakamoto</surname><given-names>M</given-names></name><name><surname>Kunimura</surname><given-names>K</given-names></name><name><surname>Suzuki</surname><given-names>JI</given-names></name><name><surname>Kodera</surname><given-names>Y</given-names></name></person-group><article-title>Antimicrobial properties of hydrophobic compounds in garlic: Allicin, vinyldithiin, ajoene and diallyl polysulfides</article-title><source>Exp Ther Med</source><volume>19</volume><fpage>1550</fpage><lpage>1553</lpage><year>2020</year><pub-id pub-id-type="pmid">32010337</pub-id><pub-id pub-id-type="doi">10.3892/etm.2019.8388</pub-id></element-citation></ref>
<ref id="b5-BR-20-3-01733"><label>5</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Khounganian</surname><given-names>RM</given-names></name><name><surname>Alwakeel</surname><given-names>A</given-names></name><name><surname>Albadah</surname><given-names>A</given-names></name><name><surname>Nakshabandi</surname><given-names>A</given-names></name><name><surname>Alharbi</surname><given-names>S</given-names></name><name><surname>Almslam</surname><given-names>AS</given-names></name></person-group><article-title>The antifungal efficacy of pure garlic, onion, and lemon extracts against <italic>Candida albicans</italic></article-title><source>Cureus</source><volume>15</volume><issue>e38637</issue><year>2023</year><pub-id pub-id-type="pmid">37284395</pub-id><pub-id pub-id-type="doi">10.7759/cureus.38637</pub-id></element-citation></ref>
<ref id="b6-BR-20-3-01733"><label>6</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Hutchins</surname><given-names>E</given-names></name><name><surname>Shaikh</surname><given-names>K</given-names></name><name><surname>Kinninger</surname><given-names>A</given-names></name><name><surname>Cherukuri</surname><given-names>L</given-names></name><name><surname>Birudaraju</surname><given-names>D</given-names></name><name><surname>Mao</surname><given-names>SS</given-names></name><name><surname>Nakanishi</surname><given-names>R</given-names></name><name><surname>Almeida</surname><given-names>S</given-names></name><name><surname>Jayawardena</surname><given-names>E</given-names></name><name><surname>Shekar</surname><given-names>C</given-names></name><etal/></person-group><article-title>Aged garlic extract reduces left ventricular myocardial mass in patients with diabetes: A prospective randomized controlled double-blind study</article-title><source>Exp Ther Med</source><volume>19</volume><fpage>1468</fpage><lpage>1471</lpage><year>2020</year><pub-id pub-id-type="pmid">32010324</pub-id><pub-id pub-id-type="doi">10.3892/etm.2019.8373</pub-id></element-citation></ref>
<ref id="b7-BR-20-3-01733"><label>7</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Shaikh</surname><given-names>K</given-names></name><name><surname>Kinninger</surname><given-names>A</given-names></name><name><surname>Cherukuri</surname><given-names>L</given-names></name><name><surname>Birudaraju</surname><given-names>D</given-names></name><name><surname>Nakanishi</surname><given-names>R</given-names></name><name><surname>Almeida</surname><given-names>S</given-names></name><name><surname>Jayawardena</surname><given-names>E</given-names></name><name><surname>Shekar</surname><given-names>C</given-names></name><name><surname>Flores</surname><given-names>F</given-names></name><name><surname>Hamal</surname><given-names>S</given-names></name><etal/></person-group><article-title>Aged garlic extract reduces low attenuation plaque in coronary arteries of patients with diabetes: A randomized, double-blind, placebo-controlled study</article-title><source>Exp Ther Med</source><volume>19</volume><fpage>1457</fpage><lpage>1461</lpage><year>2020</year><pub-id pub-id-type="pmid">32010322</pub-id><pub-id pub-id-type="doi">10.3892/etm.2019.8371</pub-id></element-citation></ref>
<ref id="b8-BR-20-3-01733"><label>8</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Ohtani</surname><given-names>M</given-names></name><name><surname>Nishimura</surname><given-names>T</given-names></name></person-group><article-title>The preventive and therapeutic application of garlic and other plant ingredients in the treatment of periodontal diseases</article-title><source>Exp Ther Med</source><volume>19</volume><fpage>1507</fpage><lpage>1510</lpage><year>2020</year><pub-id pub-id-type="pmid">32010331</pub-id><pub-id pub-id-type="doi">10.3892/etm.2019.8382</pub-id></element-citation></ref>
<ref id="b9-BR-20-3-01733"><label>9</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Mann</surname><given-names>J</given-names></name><name><surname>Bernstein</surname><given-names>Y</given-names></name><name><surname>Findler</surname><given-names>M</given-names></name></person-group><article-title>Periodontal disease and its prevention, by traditional and new avenues</article-title><source>Exp Ther Med</source><volume>19</volume><fpage>1504</fpage><lpage>1506</lpage><year>2020</year><pub-id pub-id-type="pmid">32010330</pub-id><pub-id pub-id-type="doi">10.3892/etm.2019.8381</pub-id></element-citation></ref>
<ref id="b10-BR-20-3-01733"><label>10</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Gruenwald</surname><given-names>J</given-names></name><name><surname>Bongartz</surname><given-names>U</given-names></name><name><surname>Bothe</surname><given-names>G</given-names></name><name><surname>Uebelhack</surname><given-names>R</given-names></name></person-group><article-title>Effects of aged garlic extract on arterial elasticity in a placebo-controlled clinical trial using EndoPAT<sup>&#x2122;</sup> technology</article-title><source>Exp Ther Med</source><volume>19</volume><fpage>1490</fpage><lpage>1499</lpage><year>2020</year><pub-id pub-id-type="pmid">32010328</pub-id><pub-id pub-id-type="doi">10.3892/etm.2019.8378</pub-id></element-citation></ref>
<ref id="b11-BR-20-3-01733"><label>11</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Matsutomo</surname><given-names>T</given-names></name></person-group><article-title>Potential benefits of garlic and other dietary supplements for the management of hypertension</article-title><source>Exp Ther Med</source><volume>19</volume><fpage>1479</fpage><lpage>1484</lpage><year>2020</year><pub-id pub-id-type="pmid">32010326</pub-id><pub-id pub-id-type="doi">10.3892/etm.2019.8375</pub-id></element-citation></ref>
<ref id="b12-BR-20-3-01733"><label>12</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Ried</surname><given-names>K</given-names></name></person-group><article-title>Garlic lowers blood pressure in hypertensive subjects, improves arterial stiffness and gut microbiota: A review and meta-analysis</article-title><source>Exp Ther Med</source><volume>19</volume><fpage>1472</fpage><lpage>1478</lpage><year>2020</year><pub-id pub-id-type="pmid">32010325</pub-id><pub-id pub-id-type="doi">10.3892/etm.2019.8374</pub-id></element-citation></ref>
<ref id="b13-BR-20-3-01733"><label>13</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Kosuge</surname><given-names>Y</given-names></name></person-group><article-title>Neuroprotective mechanisms of S-allyl-L-cysteine in neurological disease</article-title><source>Exp Ther Med</source><volume>19</volume><fpage>1565</fpage><lpage>1569</lpage><year>2020</year><pub-id pub-id-type="pmid">32010340</pub-id><pub-id pub-id-type="doi">10.3892/etm.2019.8391</pub-id></element-citation></ref>
<ref id="b14-BR-20-3-01733"><label>14</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Sripanidkulchai</surname><given-names>B</given-names></name></person-group><article-title>Benefits of aged garlic extract on Alzheimer&#x0027;s disease: Possible mechanisms of action</article-title><source>Exp Ther Med</source><volume>19</volume><fpage>1560</fpage><lpage>1564</lpage><year>2020</year><pub-id pub-id-type="pmid">32010339</pub-id><pub-id pub-id-type="doi">10.3892/etm.2019.8390</pub-id></element-citation></ref>
<ref id="b15-BR-20-3-01733"><label>15</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Rahman</surname><given-names>K</given-names></name><name><surname>Lowe</surname><given-names>GM</given-names></name></person-group><article-title>Garlic and cardiovascular disease: A critical review</article-title><source>J Nutr</source><volume>136 (Suppl 3)</volume><fpage>736S</fpage><lpage>740S</lpage><year>2006</year><pub-id pub-id-type="pmid">16484553</pub-id><pub-id pub-id-type="doi">10.1093/jn/136.3.736S</pub-id></element-citation></ref>
<ref id="b16-BR-20-3-01733"><label>16</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Gruhlke</surname><given-names>MCH</given-names></name><name><surname>Nicco</surname><given-names>C</given-names></name><name><surname>Batteux</surname><given-names>F</given-names></name><name><surname>Slusarenko</surname><given-names>AJ</given-names></name></person-group><article-title>The effects of allicin, a reactive sulfur species from garlic, on a selection of mammalian cell lines</article-title><source>Antioxidants (Basel)</source><volume>6</volume><issue>1</issue><year>2016</year><pub-id pub-id-type="pmid">28035949</pub-id><pub-id pub-id-type="doi">10.3390/antiox6010001</pub-id></element-citation></ref>
<ref id="b17-BR-20-3-01733"><label>17</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Kita</surname><given-names>T</given-names></name><name><surname>Kume</surname><given-names>N</given-names></name><name><surname>Minami</surname><given-names>M</given-names></name><name><surname>Hayashida</surname><given-names>K</given-names></name><name><surname>Murayama</surname><given-names>T</given-names></name><name><surname>Sano</surname><given-names>H</given-names></name><name><surname>Moriwaki</surname><given-names>H</given-names></name><name><surname>Kataoka</surname><given-names>H</given-names></name><name><surname>Nishi</surname><given-names>E</given-names></name><name><surname>Horiuchi</surname><given-names>H</given-names></name><etal/></person-group><article-title>Role of oxidized LDL in atherosclerosis</article-title><source>Ann N Y Acad Sci</source><volume>947</volume><fpage>199</fpage><lpage>206</lpage><year>2001</year><pub-id pub-id-type="pmid">11795267</pub-id><pub-id pub-id-type="doi">10.1111/j.1749-6632.2001.tb03941.x</pub-id></element-citation></ref>
<ref id="b18-BR-20-3-01733"><label>18</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Stoll</surname><given-names>A</given-names></name><name><surname>Seebeck</surname><given-names>E</given-names></name></person-group><article-title>About alliin, the genuine mother substance of garlic oil</article-title><source>Helv Chim Acta</source><volume>31</volume><fpage>189</fpage><lpage>210</lpage><year>1948</year><pub-id pub-id-type="pmid">20295196</pub-id><pub-id pub-id-type="doi">10.1007/BF02137698</pub-id><comment>(In German)</comment></element-citation></ref>
<ref id="b19-BR-20-3-01733"><label>19</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Cavallito</surname><given-names>CJ</given-names></name><name><surname>Bailey</surname><given-names>JH</given-names></name></person-group><article-title>Allicin, the antibacterial principle of <italic>Allium sativum</italic> I. isolation, physical properties and antibacterial action</article-title><source>J Am Chem Soc</source><volume>66</volume><fpage>1950</fpage><lpage>1951</lpage><year>1944</year></element-citation></ref>
<ref id="b20-BR-20-3-01733"><label>20</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Kim</surname><given-names>YS</given-names></name><name><surname>Kim</surname><given-names>KS</given-names></name><name><surname>Han</surname><given-names>I</given-names></name><name><surname>Kim</surname><given-names>MH</given-names></name><name><surname>Jung</surname><given-names>MH</given-names></name><name><surname>Park</surname><given-names>HK</given-names></name></person-group><article-title>Quantitative and qualitative analysis of the antifungal activity of allicin alone and in combination with antifungal drugs</article-title><source>PLoS One</source><volume>7</volume><issue>e38242</issue><year>2012</year><pub-id pub-id-type="pmid">22679493</pub-id><pub-id pub-id-type="doi">10.1371/journal.pone.0038242</pub-id></element-citation></ref>
<ref id="b21-BR-20-3-01733"><label>21</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Yoshimoto</surname><given-names>N</given-names></name><name><surname>Saito</surname><given-names>K</given-names></name></person-group><article-title>S-Alk(en)ylcysteine sulfoxides in the genus <italic>Allium</italic>: Proposed biosynthesis, chemical conversion, and bioactivities</article-title><source>J Exp Bot</source><volume>70</volume><fpage>4123</fpage><lpage>4137</lpage><year>2019</year><pub-id pub-id-type="pmid">31106832</pub-id><pub-id pub-id-type="doi">10.1093/jxb/erz243</pub-id></element-citation></ref>
<ref id="b22-BR-20-3-01733"><label>22</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Valentino</surname><given-names>H</given-names></name><name><surname>Campbell</surname><given-names>AC</given-names></name><name><surname>Schuermann</surname><given-names>JP</given-names></name><name><surname>Sultana</surname><given-names>N</given-names></name><name><surname>Nam</surname><given-names>HG</given-names></name><name><surname>LeBlanc</surname><given-names>S</given-names></name><name><surname>Tanner</surname><given-names>JJ</given-names></name><name><surname>Sobrado</surname><given-names>P</given-names></name></person-group><article-title>Structure and function of a flavin-dependent S-monooxygenase from garlic (<italic>Allium sativum</italic>)</article-title><source>J Biol Chem</source><volume>295</volume><fpage>11042</fpage><lpage>11055</lpage><year>2020</year><pub-id pub-id-type="pmid">32527723</pub-id><pub-id pub-id-type="doi">10.1074/jbc.RA120.014484</pub-id></element-citation></ref>
<ref id="b23-BR-20-3-01733"><label>23</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Borlinghaus</surname><given-names>J</given-names></name><name><surname>Albrecht</surname><given-names>F</given-names></name><name><surname>Gruhlke</surname><given-names>MC</given-names></name><name><surname>Nwachukwu</surname><given-names>ID</given-names></name><name><surname>Slusarenko</surname><given-names>AJ</given-names></name></person-group><article-title>Allicin: Chemistry and biological properties</article-title><source>Molecules</source><volume>19</volume><fpage>12591</fpage><lpage>12618</lpage><year>2014</year><pub-id pub-id-type="pmid">25153873</pub-id><pub-id pub-id-type="doi">10.3390/molecules190812591</pub-id></element-citation></ref>
<ref id="b24-BR-20-3-01733"><label>24</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Yoshimoto</surname><given-names>N</given-names></name><name><surname>Yabe</surname><given-names>A</given-names></name><name><surname>Sugino</surname><given-names>Y</given-names></name><name><surname>Murakami</surname><given-names>S</given-names></name><name><surname>Sai-Ngam</surname><given-names>N</given-names></name><name><surname>Sumi</surname><given-names>S</given-names></name><name><surname>Tsuneyoshi</surname><given-names>T</given-names></name><name><surname>Saito</surname><given-names>K</given-names></name></person-group><article-title>Garlic &#x03B3;-glutamyl transpeptidases that catalyze deglutamylation of biosynthetic intermediate of alliin</article-title><source>Front Plant Sci</source><volume>5</volume><issue>758</issue><year>2014</year><pub-id pub-id-type="pmid">25620969</pub-id><pub-id pub-id-type="doi">10.3389/fpls.2014.00758</pub-id></element-citation></ref>
<ref id="b25-BR-20-3-01733"><label>25</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Sun</surname><given-names>X</given-names></name><name><surname>Zhu</surname><given-names>S</given-names></name><name><surname>Li</surname><given-names>N</given-names></name><name><surname>Cheng</surname><given-names>Y</given-names></name><name><surname>Zhao</surname><given-names>J</given-names></name><name><surname>Qiao</surname><given-names>X</given-names></name><name><surname>Lu</surname><given-names>L</given-names></name><name><surname>Liu</surname><given-names>S</given-names></name><name><surname>Wang</surname><given-names>Y</given-names></name><name><surname>Liu</surname><given-names>C</given-names></name><etal/></person-group><article-title>A chromosome-level genome assembly of garlic (<italic>Allium sativum</italic>) provides insights into genome evolution and allicin biosynthesis</article-title><source>Mol Plant</source><volume>13</volume><fpage>1328</fpage><lpage>1339</lpage><year>2020</year><pub-id pub-id-type="pmid">32730994</pub-id><pub-id pub-id-type="doi">10.1016/j.molp.2020.07.019</pub-id></element-citation></ref>
<ref id="b26-BR-20-3-01733"><label>26</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Sayers</surname><given-names>EW</given-names></name><name><surname>Cavanaugh</surname><given-names>M</given-names></name><name><surname>Clark</surname><given-names>K</given-names></name><name><surname>Pruitt</surname><given-names>KD</given-names></name><name><surname>Sherry</surname><given-names>ST</given-names></name><name><surname>Yankie</surname><given-names>L</given-names></name><name><surname>Karsch-Mizrachi</surname><given-names>I</given-names></name></person-group><article-title>GenBank 2023 update</article-title><source>Nucleic Acids Res</source><volume>51 (D1)</volume><fpage>D141</fpage><lpage>D144</lpage><year>2023</year><pub-id pub-id-type="pmid">36350640</pub-id><pub-id pub-id-type="doi">10.1093/nar/gkac1012</pub-id></element-citation></ref>
<ref id="b27-BR-20-3-01733"><label>27</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Stein</surname><given-names>L</given-names></name></person-group><comment>Generic feature format version 3 (GFF3). GitHub, 2020.</comment></element-citation></ref>
<ref id="b28-BR-20-3-01733"><label>28</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Camacho</surname><given-names>C</given-names></name><name><surname>Coulouris</surname><given-names>G</given-names></name><name><surname>Avagyan</surname><given-names>V</given-names></name><name><surname>Ma</surname><given-names>N</given-names></name><name><surname>Papadopoulos</surname><given-names>J</given-names></name><name><surname>Bealer</surname><given-names>K</given-names></name><name><surname>Madden</surname><given-names>TL</given-names></name></person-group><article-title>BLAST+: architecture and applications</article-title><source>BMC Bioinformatics</source><volume>10</volume><issue>421</issue><year>2009</year><pub-id pub-id-type="pmid">20003500</pub-id><pub-id pub-id-type="doi">10.1186/1471-2105-10-421</pub-id></element-citation></ref>
<ref id="b29-BR-20-3-01733"><label>29</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Slater</surname><given-names>GS</given-names></name><name><surname>Birney</surname><given-names>E</given-names></name></person-group><article-title>Automated generation of heuristics for biological sequence comparison</article-title><source>BMC Bioinformatics</source><volume>6</volume><issue>31</issue><year>2005</year><pub-id pub-id-type="pmid">15713233</pub-id><pub-id pub-id-type="doi">10.1186/1471-2105-6-31</pub-id></element-citation></ref>
<ref id="b30-BR-20-3-01733"><label>30</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Thorvaldsd&#x00F3;ttir</surname><given-names>H</given-names></name><name><surname>Robinson</surname><given-names>JT</given-names></name><name><surname>Mesirov</surname><given-names>JP</given-names></name></person-group><article-title>Integrative genomics viewer (IGV): High-performance genomics data visualization and exploration</article-title><source>Brief Bioinform</source><volume>14</volume><fpage>178</fpage><lpage>192</lpage><year>2013</year><pub-id pub-id-type="pmid">22517427</pub-id><pub-id pub-id-type="doi">10.1093/bib/bbs017</pub-id></element-citation></ref>
<ref id="b31-BR-20-3-01733"><label>31</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Brown</surname><given-names>TA</given-names></name></person-group><comment>Chapter 10: Synthesis and processing of RNA. In: Genomes. 2nd edition. Oxford: Wiley-Liss, 2002.</comment></element-citation></ref>
<ref id="b32-BR-20-3-01733"><label>32</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Dainat</surname><given-names>J</given-names></name></person-group><comment>AGAT: Another Gff Analysis Toolkit to handle annotations in any GTF/GFF format (version v0.4.0). Zenodo, 2020.</comment></element-citation></ref>
<ref id="b33-BR-20-3-01733"><label>33</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Needleman</surname><given-names>SB</given-names></name><name><surname>Wunsch</surname><given-names>CD</given-names></name></person-group><article-title>A general method applicable to the search for similarities in the amino acid sequence of two proteins</article-title><source>J Mol Biol</source><volume>48</volume><fpage>443</fpage><lpage>453</lpage><year>1970</year><pub-id pub-id-type="pmid">5420325</pub-id><pub-id pub-id-type="doi">10.1016/0022-2836(70)90057-4</pub-id></element-citation></ref>
<ref id="b34-BR-20-3-01733"><label>34</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Sievers</surname><given-names>F</given-names></name><name><surname>Wilm</surname><given-names>A</given-names></name><name><surname>Dineen</surname><given-names>D</given-names></name><name><surname>Gibson</surname><given-names>TJ</given-names></name><name><surname>Karplus</surname><given-names>K</given-names></name><name><surname>Li</surname><given-names>W</given-names></name><name><surname>Lopez</surname><given-names>R</given-names></name><name><surname>McWilliam</surname><given-names>H</given-names></name><name><surname>Remmert</surname><given-names>M</given-names></name><name><surname>S&#x00F6;ding</surname><given-names>J</given-names></name><etal/></person-group><article-title>Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega</article-title><source>Mol Syst Biol</source><volume>7</volume><issue>539</issue><year>2011</year><pub-id pub-id-type="pmid">21988835</pub-id><pub-id pub-id-type="doi">10.1038/msb.2011.75</pub-id></element-citation></ref>
<ref id="b35-BR-20-3-01733"><label>35</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Eddy</surname><given-names>SR</given-names></name></person-group><comment>HMMER development team: HMMER user&#x0027;s guide. Biological sequence analysis using profile hidden Markov models, version 3.3.2. <ext-link ext-link-type="uri" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://hmmer.org/">http://hmmer.org/</ext-link>. Accessed Nov 2020.</comment></element-citation></ref>
<ref id="b36-BR-20-3-01733"><label>36</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Kanost</surname><given-names>MR</given-names></name><name><surname>Arrese</surname><given-names>EL</given-names></name><name><surname>Cao</surname><given-names>X</given-names></name><name><surname>Chen</surname><given-names>YR</given-names></name><name><surname>Chellapilla</surname><given-names>S</given-names></name><name><surname>Goldsmith</surname><given-names>MR</given-names></name><name><surname>Grosse-Wilde</surname><given-names>E</given-names></name><name><surname>Heckel</surname><given-names>DG</given-names></name><name><surname>Herndon</surname><given-names>N</given-names></name><name><surname>Jiang</surname><given-names>H</given-names></name><etal/></person-group><article-title>Multifaceted biological insights from a draft genome sequence of the tobacco hornworm moth, <italic>Manduca sexta</italic></article-title><source>Insect Biochem Mol Biol</source><volume>76</volume><fpage>118</fpage><lpage>147</lpage><year>2016</year><pub-id pub-id-type="pmid">27522922</pub-id><pub-id pub-id-type="doi">10.1016/j.ibmb.2016.07.005</pub-id></element-citation></ref>
<ref id="b37-BR-20-3-01733"><label>37</label><element-citation publication-type="journal"><person-group person-group-type="author"><collab>UniProt Consortium</collab></person-group><article-title>UniProt: The universal protein knowledgebase in 2023</article-title><source>Nucleic Acids Res</source><volume>51 (D1)</volume><fpage>D523</fpage><lpage>D531</lpage><year>2023</year><pub-id pub-id-type="pmid">36408920</pub-id><pub-id pub-id-type="doi">10.1093/nar/gkac1052</pub-id></element-citation></ref>
<ref id="b38-BR-20-3-01733"><label>38</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Teufel</surname><given-names>F</given-names></name><name><surname>Almagro Armenteros</surname><given-names>JJ</given-names></name><name><surname>Johansen</surname><given-names>AR</given-names></name><name><surname>G&#x00ED;slason</surname><given-names>MH</given-names></name><name><surname>Pihl</surname><given-names>SI</given-names></name><name><surname>Tsirigos</surname><given-names>KD</given-names></name><name><surname>Winther</surname><given-names>O</given-names></name><name><surname>Brunak</surname><given-names>S</given-names></name><name><surname>von Heijne</surname><given-names>G</given-names></name><name><surname>Nielsen</surname><given-names>H</given-names></name></person-group><article-title>SignalP 6.0 predicts all five types of signal peptides using protein language models</article-title><source>Nat Biotechnol</source><volume>40</volume><fpage>1023</fpage><lpage>1025</lpage><year>2022</year><pub-id pub-id-type="pmid">34980915</pub-id><pub-id pub-id-type="doi">10.1038/s41587-021-01156-3</pub-id></element-citation></ref>
<ref id="b39-BR-20-3-01733"><label>39</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Hallgren</surname><given-names>J</given-names></name><name><surname>Tsirigos</surname><given-names>KD</given-names></name><name><surname>Damgaard Pedersen</surname><given-names>M</given-names></name><name><surname>Almagro Armenteros</surname><given-names>JJ</given-names></name><name><surname>Marcatili</surname><given-names>P</given-names></name><name><surname>Nielsen</surname><given-names>H</given-names></name><name><surname>Krogh</surname><given-names>A</given-names></name><name><surname>Winther</surname><given-names>O</given-names></name></person-group><comment>DeepTMHMM predicts alpha and beta transmembrane proteins using deep neural networks. bioRxiv: doi: <ext-link ext-link-type="uri" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://doi.org/10.1101/2022.04.08.487609">https://doi.org/10.1101/2022.04.08.487609</ext-link>.</comment></element-citation></ref>
<ref id="b40-BR-20-3-01733"><label>40</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Katz</surname><given-names>K</given-names></name><name><surname>Shutov</surname><given-names>O</given-names></name><name><surname>Lapoint</surname><given-names>R</given-names></name><name><surname>Kimelman</surname><given-names>M</given-names></name><name><surname>Brister</surname><given-names>JR</given-names></name><name><surname>O&#x0027;Sullivan</surname><given-names>C</given-names></name></person-group><article-title>The sequence read archive: A decade more of explosive growth</article-title><source>Nucleic Acids Res</source><volume>50 (D1)</volume><fpage>D387</fpage><lpage>D390</lpage><year>2022</year><pub-id pub-id-type="pmid">34850094</pub-id><pub-id pub-id-type="doi">10.1093/nar/gkab1053</pub-id></element-citation></ref>
<ref id="b41-BR-20-3-01733"><label>41</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Burgin</surname><given-names>J</given-names></name><name><surname>Ahamed</surname><given-names>A</given-names></name><name><surname>Cummins</surname><given-names>C</given-names></name><name><surname>Devraj</surname><given-names>R</given-names></name><name><surname>Gueye</surname><given-names>K</given-names></name><name><surname>Gupta</surname><given-names>D</given-names></name><name><surname>Gupta</surname><given-names>V</given-names></name><name><surname>Haseeb</surname><given-names>M</given-names></name><name><surname>Ihsan</surname><given-names>M</given-names></name><name><surname>Ivanov</surname><given-names>E</given-names></name><etal/></person-group><article-title>The European nucleotide archive in 2022</article-title><source>Nucleic Acids Res</source><volume>51 (D1)</volume><fpage>D121</fpage><lpage>D125</lpage><year>2023</year><pub-id pub-id-type="pmid">36399492</pub-id><pub-id pub-id-type="doi">10.1093/nar/gkac1051</pub-id></element-citation></ref>
<ref id="b42-BR-20-3-01733"><label>42</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Rice</surname><given-names>P</given-names></name><name><surname>Longden</surname><given-names>I</given-names></name><name><surname>Bleasby</surname><given-names>A</given-names></name></person-group><article-title>EMBOSS: The European molecular biology open software suite</article-title><source>Trends Genet</source><volume>16</volume><fpage>276</fpage><lpage>277</lpage><year>2000</year><pub-id pub-id-type="pmid">10827456</pub-id><pub-id pub-id-type="doi">10.1016/s0168-9525(00)02024-2</pub-id></element-citation></ref>
<ref id="b43-BR-20-3-01733"><label>43</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Jumper</surname><given-names>J</given-names></name><name><surname>Evans</surname><given-names>R</given-names></name><name><surname>Pritzel</surname><given-names>A</given-names></name><name><surname>Green</surname><given-names>T</given-names></name><name><surname>Figurnov</surname><given-names>M</given-names></name><name><surname>Ronneberger</surname><given-names>O</given-names></name><name><surname>Tunyasuvunakool</surname><given-names>K</given-names></name><name><surname>Bates</surname><given-names>R</given-names></name><name><surname>&#x017D;&#x00ED;dek</surname><given-names>A</given-names></name><name><surname>Potapenko</surname><given-names>A</given-names></name><etal/></person-group><article-title>Highly accurate protein structure prediction with AlphaFold</article-title><source>Nature</source><volume>596</volume><fpage>583</fpage><lpage>589</lpage><year>2021</year><pub-id pub-id-type="pmid">34265844</pub-id><pub-id pub-id-type="doi">10.1038/s41586-021-03819-2</pub-id></element-citation></ref>
<ref id="b44-BR-20-3-01733"><label>44</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Varadi</surname><given-names>M</given-names></name><name><surname>Anyango</surname><given-names>S</given-names></name><name><surname>Deshpande</surname><given-names>M</given-names></name><name><surname>Nair</surname><given-names>S</given-names></name><name><surname>Natassia</surname><given-names>C</given-names></name><name><surname>Yordanova</surname><given-names>G</given-names></name><name><surname>Yuan</surname><given-names>D</given-names></name><name><surname>Stroe</surname><given-names>O</given-names></name><name><surname>Wood</surname><given-names>G</given-names></name><name><surname>Laydon</surname><given-names>A</given-names></name><etal/></person-group><article-title>AlphaFold protein structure database: Massively expanding the structural coverage of protein-sequence space with high-accuracy models</article-title><source>Nucleic Acids Res</source><volume>50 (D1)</volume><fpage>D439</fpage><lpage>D444</lpage><year>2022</year><pub-id pub-id-type="pmid">34791371</pub-id><pub-id pub-id-type="doi">10.1093/nar/gkab1061</pub-id></element-citation></ref>
<ref id="b45-BR-20-3-01733"><label>45</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Mirdita</surname><given-names>M</given-names></name><name><surname>Sch&#x00FC;tze</surname><given-names>K</given-names></name><name><surname>Moriwaki</surname><given-names>Y</given-names></name><name><surname>Heo</surname><given-names>L</given-names></name><name><surname>Ovchinnikov</surname><given-names>S</given-names></name><name><surname>Steinegger</surname><given-names>M</given-names></name></person-group><article-title>ColabFold: Making protein folding accessible to all</article-title><source>Nat Methods</source><volume>19</volume><fpage>679</fpage><lpage>682</lpage><year>2022</year><pub-id pub-id-type="pmid">35637307</pub-id><pub-id pub-id-type="doi">10.1038/s41592-022-01488-1</pub-id></element-citation></ref>
<ref id="b46-BR-20-3-01733"><label>46</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Waterhouse</surname><given-names>A</given-names></name><name><surname>Bertoni</surname><given-names>M</given-names></name><name><surname>Bienert</surname><given-names>S</given-names></name><name><surname>Studer</surname><given-names>G</given-names></name><name><surname>Tauriello</surname><given-names>G</given-names></name><name><surname>Gumienny</surname><given-names>R</given-names></name><name><surname>Heer</surname><given-names>FT</given-names></name><name><surname>de Beer</surname><given-names>TAP</given-names></name><name><surname>Rempfer</surname><given-names>C</given-names></name><name><surname>Bordoli</surname><given-names>L</given-names></name><etal/></person-group><article-title>SWISS-MODEL: Homology modelling of protein structures and complexes</article-title><source>Nucleic Acids Res</source><volume>46 (W1)</volume><fpage>W296</fpage><lpage>W303</lpage><year>2018</year><pub-id pub-id-type="pmid">29788355</pub-id><pub-id pub-id-type="doi">10.1093/nar/gky427</pub-id></element-citation></ref>
<ref id="b47-BR-20-3-01733"><label>47</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Berman</surname><given-names>HM</given-names></name><name><surname>Westbrook</surname><given-names>J</given-names></name><name><surname>Feng</surname><given-names>Z</given-names></name><name><surname>Gilliland</surname><given-names>G</given-names></name><name><surname>Bhat</surname><given-names>TN</given-names></name><name><surname>Weissig</surname><given-names>H</given-names></name><name><surname>Shindyalov</surname><given-names>IN</given-names></name><name><surname>Bourne</surname><given-names>PE</given-names></name></person-group><article-title>The protein data bank</article-title><source>Nucleic Acids Res</source><volume>28</volume><fpage>235</fpage><lpage>242</lpage><year>2000</year><pub-id pub-id-type="pmid">10592235</pub-id><pub-id pub-id-type="doi">10.1093/nar/28.1.235</pub-id></element-citation></ref>
<ref id="b48-BR-20-3-01733"><label>48</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Edgar</surname><given-names>RC</given-names></name></person-group><article-title>MUSCLE: Multiple sequence alignment with high accuracy and high throughput</article-title><source>Nucleic Acids Res</source><volume>32</volume><fpage>1792</fpage><lpage>1797</lpage><year>2004</year><pub-id pub-id-type="pmid">15034147</pub-id><pub-id pub-id-type="doi">10.1093/nar/gkh340</pub-id></element-citation></ref>
<ref id="b49-BR-20-3-01733"><label>49</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Waterhouse</surname><given-names>AM</given-names></name><name><surname>Procter</surname><given-names>JB</given-names></name><name><surname>Martin</surname><given-names>DMA</given-names></name><name><surname>Clamp</surname><given-names>M</given-names></name><name><surname>Barton</surname><given-names>GJ</given-names></name></person-group><article-title>Jalview version 2-a multiple sequence alignment editor and analysis workbench</article-title><source>Bioinformatics</source><volume>25</volume><fpage>1189</fpage><lpage>1191</lpage><year>2009</year><pub-id pub-id-type="pmid">19151095</pub-id><pub-id pub-id-type="doi">10.1093/bioinformatics/btp033</pub-id></element-citation></ref>
<ref id="b50-BR-20-3-01733"><label>50</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Archie</surname><given-names>J</given-names></name><name><surname>Day</surname><given-names>WH</given-names></name><name><surname>Maddison</surname><given-names>W</given-names></name><name><surname>Meacham</surname><given-names>C</given-names></name><name><surname>Rohlf</surname><given-names>FJ</given-names></name><name><surname>Swofford</surname><given-names>D</given-names></name><name><surname>Felsenstein</surname><given-names>J</given-names></name></person-group><comment>The Newick tree format. <ext-link ext-link-type="uri" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://evolution.genetics.washington.edu/phylip/newicktree.html">http://evolution.genetics.washington.edu/phylip/newicktree.html</ext-link>.</comment></element-citation></ref>
<ref id="b51-BR-20-3-01733"><label>51</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Huson</surname><given-names>DH</given-names></name><name><surname>Scornavacca</surname><given-names>C</given-names></name></person-group><article-title>Dendroscope 3: An interactive tool for rooted phylogenetic trees and networks</article-title><source>Syst Biol</source><volume>61</volume><fpage>1061</fpage><lpage>1067</lpage><year>2012</year><pub-id pub-id-type="pmid">22780991</pub-id><pub-id pub-id-type="doi">10.1093/sysbio/sys062</pub-id></element-citation></ref>
<ref id="b52-BR-20-3-01733"><label>52</label><element-citation publication-type="journal"><person-group person-group-type="author"><collab>Schr&#x00F6;dinger LLC</collab></person-group><comment>The PyMOL molecular graphics system. PyMOL, 2023.</comment></element-citation></ref>
<ref id="b53-BR-20-3-01733"><label>53</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Strauss</surname><given-names>BS</given-names></name></person-group><article-title>Frameshift mutation, microsatellites and mismatch repair</article-title><source>Mutat Res</source><volume>437</volume><fpage>195</fpage><lpage>203</lpage><year>1999</year><pub-id pub-id-type="pmid">10592327</pub-id><pub-id pub-id-type="doi">10.1016/s1383-5742(99)00066-6</pub-id></element-citation></ref>
<ref id="b54-BR-20-3-01733"><label>54</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Treangen</surname><given-names>TJ</given-names></name><name><surname>Salzberg</surname><given-names>SL</given-names></name></person-group><article-title>Repetitive DNA and next-generation sequencing: Computational challenges and solutions</article-title><source>Nat Rev Genet</source><volume>13</volume><fpage>36</fpage><lpage>46</lpage><year>2011</year><pub-id pub-id-type="pmid">22124482</pub-id><pub-id pub-id-type="doi">10.1038/nrg3117</pub-id></element-citation></ref>
<ref id="b55-BR-20-3-01733"><label>55</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>West</surname><given-names>MB</given-names></name><name><surname>Wickham</surname><given-names>S</given-names></name><name><surname>Parks</surname><given-names>EE</given-names></name><name><surname>Sherry</surname><given-names>DM</given-names></name><name><surname>Hanigan</surname><given-names>MH</given-names></name></person-group><article-title>Human GGT2 does not autocleave into a functional enzyme: A cautionary tale for interpretation of microarray data on redox signaling</article-title><source>Antioxid Redox Signal</source><volume>19</volume><fpage>1877</fpage><lpage>1888</lpage><year>2013</year><pub-id pub-id-type="pmid">23682772</pub-id><pub-id pub-id-type="doi">10.1089/ars.2012.4997</pub-id></element-citation></ref>
<ref id="b56-BR-20-3-01733"><label>56</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Bradley</surname><given-names>K</given-names></name><name><surname>Rieger</surname><given-names>MA</given-names></name><name><surname>Collins</surname><given-names>GG</given-names></name></person-group><article-title>Classification of Australian garlic cultivars by DNA fingerprinting</article-title><source>Aust J Exp Agric</source><volume>36</volume><fpage>613</fpage><lpage>618</lpage><year>1996</year></element-citation></ref>
<ref id="b57-BR-20-3-01733"><label>57</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Gonz&#x00E1;lez</surname><given-names>RE</given-names></name><name><surname>Soto</surname><given-names>VC</given-names></name><name><surname>Sance</surname><given-names>MM</given-names></name><name><surname>Camargo</surname><given-names>AB</given-names></name><name><surname>Galmarini</surname><given-names>CR</given-names></name></person-group><article-title>Variability of solids, organosulfur compounds, pungency and health-enhancing traits in garlic (<italic>Allium sativum</italic> L.) cultivars belonging to different ecophysiological groups</article-title><source>J Agric Food Chem</source><volume>57</volume><fpage>10282</fpage><lpage>10288</lpage><year>2009</year><pub-id pub-id-type="pmid">19827749</pub-id><pub-id pub-id-type="doi">10.1021/jf9018189</pub-id></element-citation></ref>
<ref id="b58-BR-20-3-01733"><label>58</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Dalbey</surname><given-names>RE</given-names></name><name><surname>Robinson</surname><given-names>C</given-names></name></person-group><article-title>Protein translocation into and across the bacterial plasma membrane and the plant thylakoid membrane</article-title><source>Trends Biochem Sci</source><volume>24</volume><fpage>17</fpage><lpage>22</lpage><year>1999</year><pub-id pub-id-type="pmid">10087917</pub-id><pub-id pub-id-type="doi">10.1016/s0968-0004(98)01333-4</pub-id></element-citation></ref>
<ref id="b59-BR-20-3-01733"><label>59</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Driessen</surname><given-names>AJ</given-names></name><name><surname>Manting</surname><given-names>EH</given-names></name><name><surname>van der Does</surname><given-names>C</given-names></name></person-group><article-title>The structural basis of protein targeting and translocation in bacteria</article-title><source>Nat Struct Biol</source><volume>8</volume><fpage>492</fpage><lpage>498</lpage><year>2001</year><pub-id pub-id-type="pmid">11373615</pub-id><pub-id pub-id-type="doi">10.1038/88549</pub-id></element-citation></ref>
<ref id="b60-BR-20-3-01733"><label>60</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Tjalsma</surname><given-names>H</given-names></name><name><surname>Bolhuis</surname><given-names>A</given-names></name><name><surname>Jongbloed</surname><given-names>JD</given-names></name><name><surname>Bron</surname><given-names>S</given-names></name><name><surname>van Dijl</surname><given-names>JM</given-names></name></person-group><article-title>Signal peptide-dependent protein transport in <italic>Bacillus subtilis</italic>: A genome-based survey of the secretome</article-title><source>Microbiol Mol Biol Rev</source><volume>64</volume><fpage>515</fpage><lpage>547</lpage><year>2000</year><pub-id pub-id-type="pmid">10974125</pub-id><pub-id pub-id-type="doi">10.1128/MMBR.64.3.515-547.2000</pub-id></element-citation></ref>
<ref id="b61-BR-20-3-01733"><label>61</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Briggs</surname><given-names>MS</given-names></name><name><surname>Cornell</surname><given-names>DG</given-names></name><name><surname>Dluhy</surname><given-names>RA</given-names></name><name><surname>Gierasch</surname><given-names>LM</given-names></name></person-group><article-title>Conformations of signal peptides induced by lipids suggest initial steps in protein export</article-title><source>Science</source><volume>233</volume><fpage>206</fpage><lpage>208</lpage><year>1986</year><pub-id pub-id-type="pmid">2941862</pub-id><pub-id pub-id-type="doi">10.1126/science.2941862</pub-id></element-citation></ref>
<ref id="b62-BR-20-3-01733"><label>62</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Suzuki</surname><given-names>H</given-names></name><name><surname>Kumagai</surname><given-names>H</given-names></name><name><surname>Tochikura</surname><given-names>T</given-names></name></person-group><article-title>&#x03B3;-Glutamyltranspeptidase from <italic>Escherichia coli</italic> K-12: Formation and localization</article-title><source>J Bacteriol</source><volume>168</volume><fpage>1332</fpage><lpage>1335</lpage><year>1986</year><pub-id pub-id-type="pmid">2877975</pub-id><pub-id pub-id-type="doi">10.1128/jb.168.3.1332-1335.1986</pub-id></element-citation></ref>
<ref id="b63-BR-20-3-01733"><label>63</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Tate</surname><given-names>SS</given-names></name><name><surname>Meister</surname><given-names>A</given-names></name></person-group><article-title>&#x03B3;-Glutamyl transpeptidase: Catalytic, structural and functional aspects</article-title><source>Mol Cell Biochem</source><volume>39</volume><fpage>357</fpage><lpage>368</lpage><year>1981</year><pub-id pub-id-type="pmid">6118826</pub-id><pub-id pub-id-type="doi">10.1007/BF00232585</pub-id></element-citation></ref>
<ref id="b64-BR-20-3-01733"><label>64</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Thul</surname><given-names>PJ</given-names></name><name><surname>&#x00C5;kesson</surname><given-names>L</given-names></name><name><surname>Wiking</surname><given-names>M</given-names></name><name><surname>Mahdessian</surname><given-names>D</given-names></name><name><surname>Geladaki</surname><given-names>A</given-names></name><name><surname>Ait Blal</surname><given-names>H</given-names></name><name><surname>Alm</surname><given-names>T</given-names></name><name><surname>Asplund</surname><given-names>A</given-names></name><name><surname>Bj&#x00F6;rk</surname><given-names>L</given-names></name><name><surname>Breckels</surname><given-names>LM</given-names></name><etal/></person-group><article-title>A subcellular map of the human proteome</article-title><source>Science</source><volume>356</volume><issue>eaal3321</issue><year>2017</year><pub-id pub-id-type="pmid">28495876</pub-id><pub-id pub-id-type="doi">10.1126/science.aal3321</pub-id></element-citation></ref>
<ref id="b65-BR-20-3-01733"><label>65</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Koonin</surname><given-names>E</given-names></name><name><surname>Galperin</surname><given-names>M</given-names></name></person-group><comment>Chapter 2: Evolutionary concept in genetics and genomics. In: Sequence-evolution-function: Computational approaches in comparative genomics. Kluwer Academic, Boston, 2003.</comment></element-citation></ref>
<ref id="b66-BR-20-3-01733"><label>66</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Jones</surname><given-names>MG</given-names></name><name><surname>Hughes</surname><given-names>J</given-names></name><name><surname>Tregova</surname><given-names>A</given-names></name><name><surname>Milne</surname><given-names>J</given-names></name><name><surname>Tomsett</surname><given-names>AB</given-names></name><name><surname>Collin</surname><given-names>HA</given-names></name></person-group><article-title>Biosynthesis of the flavour precursors of onion and garlic</article-title><source>J Exp Bot</source><volume>55</volume><fpage>1903</fpage><lpage>1918</lpage><year>2004</year><pub-id pub-id-type="pmid">15234988</pub-id><pub-id pub-id-type="doi">10.1093/jxb/erh138</pub-id></element-citation></ref>
<ref id="b67-BR-20-3-01733"><label>67</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Penninckx</surname><given-names>MJ</given-names></name><name><surname>Jaspers</surname><given-names>CJ</given-names></name></person-group><article-title>Molecular and kinetic properties of purified &#x03B3;-glutamyl transpeptidase from yeast (<italic>Saccharomyces cerevisiae</italic>)</article-title><source>Phytochemistry</source><volume>24</volume><fpage>1913</fpage><lpage>1918</lpage><year>1985</year></element-citation></ref>
<ref id="b68-BR-20-3-01733"><label>68</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Storozhenko</surname><given-names>S</given-names></name><name><surname>Belles-Boix</surname><given-names>E</given-names></name><name><surname>Babiychuk</surname><given-names>E</given-names></name><name><surname>H&#x00E9;rouart</surname><given-names>D</given-names></name><name><surname>Davey</surname><given-names>MW</given-names></name><name><surname>Slooten</surname><given-names>L</given-names></name><name><surname>Van Montagu</surname><given-names>M</given-names></name><name><surname>Inz&#x00E9;</surname><given-names>D</given-names></name><name><surname>Kushnir</surname><given-names>S</given-names></name></person-group><article-title>Gamma-glutamyl transpeptidase in transgenic tobacco plants. Cellular localization, processing, and biochemical properties</article-title><source>Plant Physiol</source><volume>128</volume><fpage>1109</fpage><lpage>1119</lpage><year>2002</year><pub-id pub-id-type="pmid">11891265</pub-id><pub-id pub-id-type="doi">10.1104/pp.010887</pub-id></element-citation></ref>
<ref id="b69-BR-20-3-01733"><label>69</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Lancaster</surname><given-names>JE</given-names></name><name><surname>Shaw</surname><given-names>ML</given-names></name></person-group><article-title>Characterization of purified &#x03B3;-glutamyl transpeptidase in onions: Evidence for in vivo role as a peptidase</article-title><source>Phytochemistry</source><volume>36</volume><fpage>1351</fpage><lpage>1358</lpage><year>1994</year></element-citation></ref>
<ref id="b70-BR-20-3-01733"><label>70</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Nakano</surname><given-names>Y</given-names></name><name><surname>Okawa</surname><given-names>S</given-names></name><name><surname>Yamauchi</surname><given-names>T</given-names></name><name><surname>Koizumi</surname><given-names>Y</given-names></name><name><surname>Sekiya</surname><given-names>J</given-names></name></person-group><article-title>Purification and properties of soluble and bound gamma-glutamyltransferases from radish cotyledon</article-title><source>Biosci Biotechnol Biochem</source><volume>70</volume><fpage>369</fpage><lpage>376</lpage><year>2006</year><pub-id pub-id-type="pmid">16495652</pub-id><pub-id pub-id-type="doi">10.1271/bbb.70.369</pub-id></element-citation></ref>
<ref id="b71-BR-20-3-01733"><label>71</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Ikeda</surname><given-names>Y</given-names></name><name><surname>Fujii</surname><given-names>J</given-names></name><name><surname>Taniguchi</surname><given-names>N</given-names></name><name><surname>Meister</surname><given-names>A</given-names></name></person-group><article-title>Human gamma-glutamyl transpeptidase mutants involving conserved aspartate residues and the unique cysteine residue of the light subunit</article-title><source>J Biol Chem</source><volume>270</volume><fpage>12471</fpage><lpage>12475</lpage><year>1995</year><pub-id pub-id-type="pmid">7759490</pub-id></element-citation></ref>
<ref id="b72-BR-20-3-01733"><label>72</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Okada</surname><given-names>T</given-names></name><name><surname>Suzuki</surname><given-names>H</given-names></name><name><surname>Wada</surname><given-names>K</given-names></name><name><surname>Kumagai</surname><given-names>H</given-names></name><name><surname>Fukuyama</surname><given-names>K</given-names></name></person-group><article-title>Crystal structures of gamma-glutamyltranspeptidase from <italic>Escherichia coli</italic>, a key enzyme in glutathione metabolism, and its reaction intermediate</article-title><source>Proc Natl Acad Sci USA</source><volume>103</volume><fpage>6471</fpage><lpage>6476</lpage><year>2006</year><pub-id pub-id-type="pmid">16618936</pub-id><pub-id pub-id-type="doi">10.1073/pnas.0511020103</pub-id></element-citation></ref>
<ref id="b73-BR-20-3-01733"><label>73</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Grzam</surname><given-names>A</given-names></name><name><surname>Martin</surname><given-names>MN</given-names></name><name><surname>Hell</surname><given-names>R</given-names></name><name><surname>Meyer</surname><given-names>AJ</given-names></name></person-group><article-title>gamma-Glutamyl transpeptidase GGT4 initiates vacuolar degradation of glutathione S-conjugates in <italic>Arabidopsis</italic></article-title><source>FEBS Lett</source><volume>581</volume><fpage>3131</fpage><lpage>3138</lpage><year>2007</year><pub-id pub-id-type="pmid">17561001</pub-id><pub-id pub-id-type="doi">10.1016/j.febslet.2007.05.071</pub-id></element-citation></ref>
<ref id="b74-BR-20-3-01733"><label>74</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Ohkama-Ohtsu</surname><given-names>N</given-names></name><name><surname>Radwan</surname><given-names>S</given-names></name><name><surname>Peterson</surname><given-names>A</given-names></name><name><surname>Zhao</surname><given-names>P</given-names></name><name><surname>Badr</surname><given-names>AF</given-names></name><name><surname>Xiang</surname><given-names>C</given-names></name><name><surname>Oliver</surname><given-names>DJ</given-names></name></person-group><article-title>Characterization of the extracellular gamma-glutamyl transpeptidases, GGT1 and GGT2, in <italic>Arabidopsis</italic></article-title><source>Plant J</source><volume>49</volume><fpage>865</fpage><lpage>877</lpage><year>2007</year><pub-id pub-id-type="pmid">17316175</pub-id><pub-id pub-id-type="doi">10.1111/j.1365-313X.2006.03004.x</pub-id></element-citation></ref>
<ref id="b75-BR-20-3-01733"><label>75</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Ohkama-Ohtsu</surname><given-names>N</given-names></name><name><surname>Zhao</surname><given-names>P</given-names></name><name><surname>Xiang</surname><given-names>C</given-names></name><name><surname>Oliver</surname><given-names>DJ</given-names></name></person-group><article-title>Glutathione conjugates in the vacuole are degraded by gamma-glutamyl transpeptidase GGT3 in <italic>Arabidopsis</italic></article-title><source>Plant J</source><volume>49</volume><fpage>878</fpage><lpage>888</lpage><year>2007</year><pub-id pub-id-type="pmid">17316176</pub-id><pub-id pub-id-type="doi">10.1111/j.1365-313X.2006.03005.x</pub-id></element-citation></ref>
<ref id="b76-BR-20-3-01733"><label>76</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Martin</surname><given-names>MN</given-names></name><name><surname>Saladores</surname><given-names>PH</given-names></name><name><surname>Lambert</surname><given-names>E</given-names></name><name><surname>Hudson</surname><given-names>AO</given-names></name><name><surname>Leustek</surname><given-names>T</given-names></name></person-group><article-title>Localization of members of the gamma-glutamyl transpeptidase family identifies sites of glutathione and glutathione S-conjugate hydrolysis</article-title><source>Plant Physiol</source><volume>144</volume><fpage>1715</fpage><lpage>1732</lpage><year>2007</year><pub-id pub-id-type="pmid">17545509</pub-id><pub-id pub-id-type="doi">10.1104/pp.106.094409</pub-id></element-citation></ref>
<ref id="b77-BR-20-3-01733"><label>77</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Shaw</surname><given-names>ML</given-names></name><name><surname>Pither-Joyce</surname><given-names>MD</given-names></name><name><surname>McCallum</surname><given-names>JA</given-names></name></person-group><article-title>Purification and cloning of a gamma-glutamyl transpeptidase from onion (<italic>Allium cepa</italic>)</article-title><source>Phytochemistry</source><volume>66</volume><fpage>515</fpage><lpage>522</lpage><year>2005</year><pub-id pub-id-type="pmid">15721943</pub-id><pub-id pub-id-type="doi">10.1016/j.phytochem.2005.01.017</pub-id></element-citation></ref>
</ref-list>
</back>
<floats-group>
<fig id="f1-BR-20-3-01733" position="float">
<label>Figure 1</label>
<caption><p>Workflow of the main steps of the bioinformatics pipeline.</p></caption>
<graphic xlink:href="br-20-03-01733-g00.tif" />
</fig>
<fig id="f2-BR-20-3-01733" position="float">
<label>Figure 2</label>
<caption><p>Genome mapping of AsGGT genes. (A) AsGGT1 is located on chromosome 8: 403,215,661-403,234,269, (B) AsGGT2 on chromosome 5: 617,284,828-617,294,697, (C) AsGGT3 on chromosome 4: 182,866,703-182,871,895 and (D) AsGGT4 on chromosome 6: 92,586,715-92,589,081. All four genes consist of seven exons of comparable size, though the size of the corresponding introns varies, resulting in a considerable difference in gene size: AsGGT1 is 18,609 bp, AsGGT2 is 9,870 bp and AsGGT3 is 5,193 bp. AsGGT, <italic>Allium sativum</italic> &#x03B3;-glutamyl-transpeptidase.</p></caption>
<graphic xlink:href="br-20-03-01733-g01.tif" />
</fig>
<fig id="f3-BR-20-3-01733" position="float">
<label>Figure 3</label>
<caption><p>Discovery of deletions and insertions in the coding region of the AsGGT3 gene. (A) An addition of an adenine appears at position 182,867,809 in the second exon, (B) an addition of a guanine appears at position 182,869,779 in the third exon, (C) a deletion of an adenine appears at position 182,870,778 in the fourth exon and (D) an addition of a guanine appears at position 182,870,929 in the seventh exon. The top line indicates the genomic sequence, the lines in grey indicate the translated sequence in the +1, +2 and +3 open reading frames, and the line in purple indicates the reading frames that correctly correspond to AsGGT3 mRNA. AsGGT, <italic>Allium sativum</italic> &#x03B3;-glutamyl-transpeptidase.</p></caption>
<graphic xlink:href="br-20-03-01733-g02.tif" />
</fig>
<fig id="f4-BR-20-3-01733" position="float">
<label>Figure 4</label>
<caption><p>Genome mapping of the AsGGT4 gene. AsGGT4 has a size of 2,336 bp and, similar to the three already characterized genes (AsGGT1, AsGGT2 and AsGGT3), consists of seven exons. Unlike those genes, however, AsGGT4 is transcribed in the reverse direction. AsGGT, <italic>Allium sativum</italic> &#x03B3;-glutamyl-transpeptidase.</p></caption>
<graphic xlink:href="br-20-03-01733-g03.tif" />
</fig>
<fig id="f5-BR-20-3-01733" position="float">
<label>Figure 5</label>
<caption><p>Species related to <italic>Allium sativum</italic> in which sequences homologous to AsGGT4 were found. AsGGT, <italic>Allium sativum</italic> &#x03B3;-glutamyl-transpeptidase.</p></caption>
<graphic xlink:href="br-20-03-01733-g04.tif" />
</fig>
<fig id="f6-BR-20-3-01733" position="float">
<label>Figure 6</label>
<caption><p>Comparison of SignalP signal peptide predictions for the protein sequence of AsGGT3. (A) SignalP revealed a potential signal peptide profile in the original sequence, mainly in the 23-46 amino acid region of AsGGT3, which begins with a methionine. (B) When the first 22 amino acids were removed from the sequence, SignalP revealed the presence of a signal peptide in the truncated sequence, where amino acids 1-7 form its amino terminal region, amino acids 8-18 are hydrophobic and amino acids 19-24 form its carboxy terminal end. AsGGT, <italic>Allium sativum</italic> &#x03B3;-glutamyl-transpeptidase.</p></caption>
<graphic xlink:href="br-20-03-01733-g05.tif" />
</fig>
<fig id="f7-BR-20-3-01733" position="float">
<label>Figure 7</label>
<caption><p>Comparison of SignalP signal peptide predictions for the protein sequence of AsGGT4. (A) SignalP predicted a 19 amino acid long signal peptide in the amino terminal coding region of the second exon of the initial sequence, implying that the coding sequence of the first exon of AsGGT4 would code for approximately another four amino acids. (B) SignalP predicted a signal peptide of 23 amino acids in the full length AsGGT4 (modified peptide sequence) containing the first CDS, where the first four amino acids form its amino terminal region, amino acids 5-18 are hydrophobic and amino acids 19-23 form its carboxy terminus. AsGGT, <italic>Allium sativum</italic> &#x03B3;-glutamyl-transpeptidase.</p></caption>
<graphic xlink:href="br-20-03-01733-g06.tif" />
</fig>
<fig id="f8-BR-20-3-01733" position="float">
<label>Figure 8</label>
<caption><p>Comparison between the genomic mappings of AsGGT genes based on manual (top panels) and automatic (bottom panels) gene annotation. Only manual annotation can identify UTRs. The coding regions of (A) AsGGT1 and (B) AsGGT2 are identical to the coding regions obtained by automated genomic annotation; by contrast, automated genomic annotation failed to correctly assign the exon boundaries and coding regions of (C) AsGGT3 and (D) AsGGT4. The protein sequence extracted from the automatic annotation of the AsGGT3 gene (C) also differed from the one previously reported (<xref rid="b24-BR-20-3-01733" ref-type="bibr">24</xref>).</p></caption>
<graphic xlink:href="br-20-03-01733-g07.tif" />
</fig>
<fig id="f9-BR-20-3-01733" position="float">
<label>Figure 9</label>
<caption><p>Pairwise sequence alignment of the AsGGT4 peptide sequence obtained by automatic annotation (Asa6G00348.1) with the corrected AsGGT4 peptide annotation discovered in the present study. This pairwise sequence alignment indicates that the extracted peptide sequence from the automated annotation is frame-shifted. AsGGT, <italic>Allium sativum</italic> &#x03B3;-glutamyl-transpeptidase.</p></caption>
<graphic xlink:href="br-20-03-01733-g08.tif" />
</fig>
<fig id="f10-BR-20-3-01733" position="float">
<label>Figure 10</label>
<caption><p>Multiple sequence alignment of the protein sequences of the three characterized AsGGT peptides, the newly discovered fourth AsGGT peptide and the <italic>Bacillus licheniformis</italic> glutathione hydrolase proenzyme (Q65KZ6_BACLD). The five peptide sequences appear to be conserved (apart from their N-terminal sequences), although <italic>Bacillus licheniformis</italic> is phylogenetically distant from <italic>Allium sativum</italic>. AsGGT, <italic>Allium sativum</italic> &#x03B3;-glutamyl-transpeptidase.</p></caption>
<graphic xlink:href="br-20-03-01733-g09.tif" />
</fig>
<fig id="f11-BR-20-3-01733" position="float">
<label>Figure 11</label>
<caption><p>Radial phylogram of the peptide sequences of &#x03B3;-glutamyl-transpeptidases of <italic>Allium sativum</italic> and glutathione hydrolase of <italic>Bacillus licheniformis</italic>. The root of the AsGGT subtree is the point at which the outgroup (Q65KZ6_BACLD) connects to the subtree of the AsGGT peptides. The common ancestor of the four AsGGTs appears to have given rise to two ancestral peptides, which subsequently gave rise to the AsGGT1 and AsGGT2 peptides and to the AsGGT3 and AsGGT4 peptides, respectively. AsGGT, <italic>Allium sativum</italic> &#x03B3;-glutamyl-transpeptidase.</p></caption>
<graphic xlink:href="br-20-03-01733-g10.tif" />
</fig>
<fig id="f12-BR-20-3-01733" position="float">
<label>Figure 12</label>
<caption><p>AsGGT4 structure predicted by SWISS-MODEL. (A) Surface visualization. (B) Cartoon visualization. A &#x03B2;-sandwich structure is surrounded by clusters of interacting &#x03B1;-helices. The catalytic T372 appears in the active site. The color spectrum in both visualizations reflects the confidence of the prediction. AsGGT, <italic>Allium sativum</italic> &#x03B3;-glutamyl-transpeptidase.</p></caption>
<graphic xlink:href="br-20-03-01733-g11.tif" />
</fig>
<fig id="f13-BR-20-3-01733" position="float">
<label>Figure 13</label>
<caption><p>AsGGT4 structure predicted by AlphaFold. (A) Surface visualization. (B) Cartoon visualization. A &#x03B2;-sandwich structure is surrounded by clusters of interacting &#x03B1;-helices. The catalytic T372 appears in the active site. An unstructured region corresponds to the sequence of the predicted signal peptide. The color spectrum in both visualizations reflects the confidence of the prediction. AsGGT, <italic>Allium sativum</italic> &#x03B3;-glutamyl-transpeptidase.</p></caption>
<graphic xlink:href="br-20-03-01733-g12.tif" />
</fig>
<fig id="f14-BR-20-3-01733" position="float">
<label>Figure 14</label>
<caption><p>Superposition of the AsGGT4 structures predicted by SWISS-MODEL (green) and AlphaFold (red) in cartoon visualization. The main difference between the two structures is that the signal peptide only appears in the AlphaFold prediction. In addition, the position of a flexible loop relative to the active site differs between the two structures. AsGGT, <italic>Allium sativum</italic> &#x03B3;-glutamyl-transpeptidase.</p></caption>
<graphic xlink:href="br-20-03-01733-g13.tif" />
</fig>
</floats-group>
</article>
