Sequence analysis of hepatitis C virus nonstructural protein 3‑4A serine protease and prediction of conserved B and T cell epitopes
- Ayesha Naeem
- Yasir Waheed
- Published online on: October 24, 2017 https://doi.org/10.3892/br.2017.1007
The hepatitis C virus (HCV) is a plus-stranded RNA virus belonging to the Flaviviridae family. HCV causes acute and chronic hepatitis, and more than half of HCV patients develop liver cirrhosis or hepatocellular carcinoma (1). Globally, 130–150 million people are living with HCV (2).
HCV has six major genotypes, which vary in geographic distribution, response to therapy and disease progression (3). The most prevalent genotypes of HCV are 1, 2 and 3, which occur around the globe. HCV genotype 4 is predominant in Africa and the Middle East, while genotypes 5 and 6 are predominant in South Africa and Hong Kong, respectively (4). The most prevalent genotype in Pakistan is 3 (3).
There is currently no vaccine available for protection from HCV. Between 2001 and 2011, interferon and ribavirin were administered for HCV treatment. These medications produced limited responses and many adverse effects. Numerous interferon-free therapeutic strategies are at various stages of development, and a few of these produce a high response rate with minimal adverse effects (5,6). The prevalence of HCV is particularly high in the multitransfused patient population and in individuals who inject drugs (7,8).
The HCV genome comprises approximately 9,600 nucleotides and encodes a single polyprotein, which is processed into four structural proteins and six nonstructural proteins. HCV nonstructural protein 3 (NS3) is a 631-amino acid dual-function protein, with a serine protease domain in the N-terminal one-third and an RNA helicase domain in the C-terminal two-thirds (9,10). NS4A is a cofactor for the activation of the protease subunit of HCV (11). HCV protease has proven to be crucial for viral replication and is considered to be the best target for the development of anti-HCV therapeutic strategies (12). Thus, the aim of the present study was to develop a global consensus sequence of the HCV serine protease and predict conserved B- and T-cell binding epitopes.
Materials and methods
Sequence extraction and translation
A total of 160 complete genome sequences of HCV, covering all six genotypes, were extracted from the National Center for Biotechnology Information database (https://www.ncbi.nlm.nih.gov/nuccore/). The HCV genome sequences were trimmed using the NS3-4A region of the HCV-H isolate as the reference sequence. The resulting NS3-4A nucleotide sequences were then translated to their corresponding amino acid sequences using CLC Main Workbench software v.8 (Qiagen GmbH, Hilden, Germany).
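The translation step can be illustrated with a minimal Python sketch. It is not the CLC Main Workbench procedure; it simply applies the standard genetic code to an in-frame nucleotide string. The function name `translate` and the short input fragment are illustrative assumptions, not data from the present study.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(nucleotides: str) -> str:
    """Translate an in-frame nucleotide sequence; unknown codons become 'X'."""
    seq = nucleotides.upper().replace("U", "T")
    protein = []
    # Ignore any trailing bases that do not form a complete codon.
    for i in range(0, len(seq) - len(seq) % 3, 3):
        protein.append(CODON_TABLE.get(seq[i:i + 3], "X"))
    return "".join(protein)


# Toy in-frame fragment (hypothetical input, not a sequence from the study).
print(translate("GCGCCCATCACGGCCTAC"))  # APITAY
```

In practice the trimmed NS3-4A region must start in frame with the polyprotein, otherwise the translation is meaningless; the sketch assumes this has already been ensured.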
Development of global consensus sequence
The amino acid sequences were aligned to obtain specific consensus sequences for all six genotypes of HCV. The respective consensus sequences for genotypes 1–6 of HCV NS3-4A were developed utilizing the CLC workbench software. The consensus sequences were subsequently aligned together to obtain a global consensus sequence (13). The global consensus sequence of HCV NS3-4A serine protease was analyzed for its variable residues and highly conserved domains that determine the activity of the serine protease. Short highly conserved peptides were selected from the consensus sequence of HCV NS3-4A.
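The two-level consensus procedure described above can be sketched in Python as follows. This is a simplified majority-rule illustration, not the algorithm used by the CLC workbench software; it assumes the sequences are already aligned to equal length, and the function names, the alphabetical tie-break and the toy alignments are all assumptions made for the example.

```python
from collections import Counter


def consensus(aligned):
    """Majority-rule consensus of equal-length, pre-aligned sequences."""
    if len({len(s) for s in aligned}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    residues = []
    for column in zip(*aligned):
        counts = Counter(column)
        # Most frequent residue; ties are broken alphabetically (assumption).
        residues.append(min(counts, key=lambda r: (-counts[r], r)))
    return "".join(residues)


def variable_positions(aligned):
    """1-based alignment columns where the sequences disagree."""
    return [i for i, col in enumerate(zip(*aligned), start=1)
            if len(set(col)) > 1]


# Toy alignments (hypothetical, not data from the study).
by_genotype = {
    "1": ["APITAY", "APITAY", "APVTAY"],
    "2": ["APITSY", "APITSY", "APITAY"],
    "3": ["APITAY", "APITAY", "APITAY"],
}
# Step 1: one consensus per genotype.
genotype_consensus = {g: consensus(seqs) for g, seqs in by_genotype.items()}
# Step 2: global consensus across the genotype consensuses.
global_consensus = consensus(list(genotype_consensus.values()))

print(global_consensus)                                        # APITAY
print(variable_positions(list(genotype_consensus.values())))   # [5]
```

Building the global consensus from per-genotype consensuses, rather than from all sequences pooled, keeps genotypes with many deposited sequences from dominating the result.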
B-cell and T-cell epitope prediction
The locations of possible B- and T-cell epitopes were mapped in the consensus sequence of the NS3-4A gene. Possible B-cell epitopes in NS3-4A were predicted for antibody binding using the Immune Epitope Database (IEDB) (http://www.iedb.org/). Similarly, target epitopes for T-lymphocytes in NS3-4A were identified for binding to major histocompatibility complex (MHC) class I and II using ProPred-I and ProPred software (http://crdd.osdd.net/raghava//propred/), respectively. The predicted B- and T-cell epitopes in HCV NS3-4A were subjected to conservation analysis in the IEDB epitope conservation analysis tool. The epitopes with 80–100% conservancy were selected, and these epitopes were compared with the human proteome to confirm that these peptides would not trigger autoimmunity.
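The conservancy filter can be illustrated with a short Python sketch. It is a simplified stand-in for the IEDB epitope conservation analysis tool, not its implementation: conservancy is taken here as the percentage of protein sequences that contain the epitope in an ungapped window at or above a chosen identity level. The function names, the `min_identity` parameter and the toy sequences are assumptions made for the example.

```python
def best_identity(epitope, protein):
    """Highest fraction of matching residues over all ungapped windows."""
    if not epitope:
        raise ValueError("epitope must not be empty")
    k = len(epitope)
    best = 0.0
    for start in range(len(protein) - k + 1):
        window = protein[start:start + k]
        matches = sum(a == b for a, b in zip(epitope, window))
        best = max(best, matches / k)
    return best


def conservancy(epitope, proteins, min_identity=1.0):
    """Percentage of proteins containing the epitope at >= min_identity."""
    if not proteins:
        raise ValueError("at least one protein sequence is required")
    hits = sum(best_identity(epitope, p) >= min_identity for p in proteins)
    return 100.0 * hits / len(proteins)


# Toy sequences and candidate epitopes (hypothetical, not study data).
proteins = ["APITAYAQQ", "APITAYSQQ", "APVTAYAQQ", "APITAYAQQ", "APITAYAQQ"]
candidates = ["TAY", "ITAYA"]

# Keep only epitopes found in at least 80% of the sequences.
selected = [e for e in candidates if conservancy(e, proteins) >= 80.0]
print(selected)  # ['TAY']
```

The same window scan could in principle be run against human protein sequences to flag epitopes with close self matches, although a real proteome-wide comparison would normally use a dedicated search tool.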
Results
Development of global consensus sequence and selection of conserved peptides
An HCV NS3-4A consensus sequence was derived separately for each genotype. The genotypic consensus sequences were used to develop the global consensus sequence, which aided the analysis of amino acids conserved among all the genotypes of HCV.
Small peptide fragments consisting of 9–18 amino acid residues were deduced from the highly conserved regions of the NS3-4A consensus sequence (Table I). These highly conserved residues offer potential target sites for peptide vaccine development or for designing site-specific HCV inhibitors.
Table I. Position and sequences of the peptides that may be used for development of peptide vaccines.
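One way to locate candidate stretches of this kind is to scan the aligned genotype consensus sequences for runs of fully conserved columns. The Python sketch below illustrates the idea under stated assumptions: it is not the procedure used in the present study, the function name `conserved_runs` and the toy sequences are hypothetical, and only the minimum length of 9 residues is taken from the text.

```python
def conserved_runs(aligned, min_len=9):
    """Return (start, end, peptide) for runs of identical columns.

    Positions are 1-based and inclusive; `aligned` holds equal-length,
    pre-aligned sequences.
    """
    columns = list(zip(*aligned))
    runs = []
    start = None
    for i, column in enumerate(columns):
        if len(set(column)) == 1:
            if start is None:
                start = i          # a conserved run begins here
        else:
            if start is not None and i - start >= min_len:
                runs.append((start + 1, i, aligned[0][start:i]))
            start = None
    # Close a run that extends to the end of the alignment.
    if start is not None and len(columns) - start >= min_len:
        runs.append((start + 1, len(columns), aligned[0][start:]))
    return runs


# Toy sequences differing at position 11 (hypothetical, not study data).
toy = ["MKTAYIAKQRLLDEG",
       "MKTAYIAKQRVLDEG"]
print(conserved_runs(toy))             # [(1, 10, 'MKTAYIAKQR')]
print(conserved_runs(toy, min_len=4))  # [(1, 10, 'MKTAYIAKQR'), (12, 15, 'LDEG')]
```

Runs longer than the desired peptide length could then be trimmed or tiled into shorter fragments.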
Prediction of B-cell epitopes
The locations of possible B-cell epitopes and of MHC class I and II T-cell epitopes were identified in the consensus sequence of the NS3-4A gene, and the predicted epitopes were subjected to conservation analysis in the IEDB epitope conservation analysis tool. The selected epitopes were compared with the human proteome to confirm that these peptides would not trigger autoimmunity.
Several B-cell epitopes were predicted by IEDB in the NS3-4A gene (Table II). Each epitope is given a distinct name, from B1 to B13. The positions of the residues, the length of the peptide and the percentage conservancy of each epitope are presented in Table II.
Table II. B-cell epitopes and their conservation in hepatitis C virus nonstructural protein 3–4A sequences from genotypes 1–6.
Among the B-cell epitopes, B1, B7, B8 and B9 are considered to be conserved among all six genotypes of HCV. These epitopes, derived from the global consensus sequence, may be capable of eliciting strong neutralizing antibodies against all six genotypes of HCV.
Prediction of MHC class I and II epitopes
In total, 38 different MHC class I epitopes were predicted by ProPred-I. The epitopes with 80–100% conservancy are presented in Table III. Among these epitopes, M4, M5, M7 and M10 were highly conserved in the HCV NS3-4A consensus sequence and all genotypes.
Table III. T-cell class I MHC-specific epitopes and their conservation in hepatitis C virus nonstructural protein 3–4A sequences from genotypes 1–6.
Various MHC class II epitopes were predicted from the HCV NS3-4A gene using ProPred software. Certain important epitopes are presented in Table IV. The epitopes T5, T7 and T10 were identified to be 95% conserved among genotypes 1–6 of HCV.
Table IV. T-cell class II MHC-specific epitopes and their conservation in hepatitis C virus nonstructural protein 3–4A sequences from genotypes 1–6.
Discussion
The NS3-4A serine protease of HCV has a highly conserved catalytic triad comprising the residues His57, Asp81 and Ser139. The catalytic triad is essential for the proteolysis of the HCV polyprotein (9). Replacing any of the catalytic triad amino acids (histidine, aspartate or serine) with any other amino acid eliminated proteolytic cleavage by NS3 (14). The consensus sequence alignment demonstrates that the residues His57, Asp81 and Ser139 remained conserved across all HCV genotypes.
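A check of this kind can be expressed as a small Python sketch. It assumes ungapped sequences numbered from the first residue of NS3, so that position 57 in the string corresponds to His57. The function names and the placeholder sequences are hypothetical; the toy data are not real NS3 sequences and only serve to show the logic.

```python
# 1-based positions in the NS3 protease domain, as given in the text.
CATALYTIC_TRIAD = {57: "H", 81: "D", 139: "S"}


def triad_status(sequence, triad=CATALYTIC_TRIAD):
    """Map each triad position to True if the expected residue is present."""
    return {
        pos: len(sequence) >= pos and sequence[pos - 1] == residue
        for pos, residue in triad.items()
    }


def triad_conserved(sequences_by_genotype):
    """True only if every sequence carries the complete triad."""
    return all(
        all(triad_status(seq).values())
        for seq in sequences_by_genotype.values()
    )


def toy_sequence(substitutions=None):
    """Build a placeholder 150-residue sequence (hypothetical, not real NS3)."""
    residues = ["G"] * 150
    for pos, residue in {**CATALYTIC_TRIAD, **(substitutions or {})}.items():
        residues[pos - 1] = residue
    return "".join(residues)


toy = {genotype: toy_sequence() for genotype in range(1, 7)}
print(triad_conserved(toy))          # True
toy[3] = toy_sequence({139: "A"})    # simulate a Ser139 -> Ala change
print(triad_status(toy[3]))          # {57: True, 81: True, 139: False}
print(triad_conserved(toy))          # False
```

With gapped alignments, the positions would first need to be mapped from reference numbering to alignment columns.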
Previous X-ray observations and computational modeling analysis have revealed that a zinc-binding site is present opposite to the catalytic triad of HCV protease (11). In the present study, results from the consensus sequence analysis indicated that the zinc-binding site amino acids are well conserved among all HCV genotypes.
NS4A is a 54-amino acid protein that forms a non-covalent heterodimer with the protease domain of NS3 (11). Mutation analysis reveals that the N-terminal 22 amino acids of NS3 are involved in the interaction with the central region (residues 21–34) of the NS4A protein. Mutations affecting the non-covalent bonding between these two proteins cause a significant decline or inhibition of protease activity, confirming that the configuration of the bonded complex is vital for protease function (15). Numerous amino acids located in the middle section of the NS4A protein form extensive hydrophobic interactions with various hydrophobic side-chains in two β strands of the NS3 serine protease, forming a sandwich-like configuration between the β-barrels of NS3 and the NS4A cofactor. These hydrophobic amino acid residues of NS4A primarily include Val23, Ile25, Ile29 and Leu31. The consensus sequence analysis reveals that the residues Val23 and Ile25 are conserved among all HCV genotypes. The residue Ile29 is replaced by Leu in genotype 2 and by Val in genotype 4, both of which are similar branched-chain amino acids. However, the residue Leu31 is replaced by Thr in genotype 6, which is a significant substitution.
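The distinction drawn above between similar and significant substitutions can be made explicit with a minimal Python sketch. The grouping used here is a deliberate simplification and an assumption of the example: Val, Leu and Ile are treated as one branched-chain aliphatic class, and any change out of that class is labeled non-conservative. The names `classify` and `reported` are hypothetical; the positions and substitutions are those stated in the text.

```python
# NS4A residues named in the text as forming hydrophobic contacts with NS3.
NS4A_CONTACTS = {23: "V", 25: "I", 29: "I", 31: "L"}

# Simplified grouping (an assumption of this sketch): branched-chain
# aliphatic residues are treated as interchangeable.
BRANCHED_CHAIN = {"V", "L", "I"}


def classify(reference: str, observed: str) -> str:
    """Label a residue change relative to the reference residue."""
    if observed == reference:
        return "conserved"
    if reference in BRANCHED_CHAIN and observed in BRANCHED_CHAIN:
        return "conservative substitution"
    return "non-conservative substitution"


# Substitutions reported in the text, keyed by (position, genotype).
reported = {(29, 2): "L", (29, 4): "V", (31, 6): "T"}

for (position, genotype), observed in reported.items():
    reference = NS4A_CONTACTS[position]
    print(f"{reference}{position} in genotype {genotype}: "
          f"{observed} -> {classify(reference, observed)}")
```

A substitution matrix would give a graded score rather than a two-way label, but the simple grouping is enough to separate the Ile29 changes from the Leu31 to Thr change.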
Highly conserved B- and T-cell binding epitopes were predicted from the consensus sequence of HCV. Among the B-cell epitopes, certain epitopes demonstrated 80–100% conservation among all six genotypes of HCV. Various predicted MHC class I and II epitopes exhibited maximum allele-binding affinity, supporting them as potential T-cell epitopes. The epitopes B1, B8 and B9 are considered to be the best targets for B cell-based vaccine development and are >95% conserved across the six major HCV genotypes. Similarly, M4, M5, M7 and M10 are the best MHC class I epitopes to be adopted as synthetic vaccines against multiple genotypes of HCV. Additionally, epitopes T5, T7 and T10 are ideal MHC class II-specific epitopes with high antigenicity scores and high conservancy across the major genotypes. In comparison to epitopes derived from highly variable genome regions, the use of conserved epitopes from the consensus sequence may provide broader protection against HCV. Therefore, these predicted epitopes may serve as effective vaccine candidates against the major genotypes of HCV.
In conclusion, despite numerous variations in the NS3-4A gene sequences from different genotypes of HCV, the functionally important residues of the serine protease and helicase are highly conserved. These regions of the NS3-4A sequence may be useful in developing antiviral agents or peptide vaccines against HCV. Prediction of epitope immunogenicity and characterization on the basis of peptide sequences is important in developing a potent peptide vaccine for HCV. However, as the epitopes predicted in the present study were based on computational analysis, the antigenic potential of the peptides should be further characterized in animal models.
Safi SZ, Badshah Y, Waheed Y, Fatima K, Tahir S, Shinwari A and Qadri I: Distribution of hepatitis C virus genotypes, hepatic steatosis and their correlation with clinical and virological factors in Pakistan. Asian Biomed. 4:253–262. 2010.
Waheed Y: Effect of interferon plus ribavirin therapy on hepatitis C virus genotype 3 patients from Pakistan: Treatment response, side effects and future prospective. Asian Pac J Trop Med. 8:85–89. 2015.
Waheed Y, Najmi MH, Aziz H, Waheed H, Imran M and Safi SZ: Prevalence of hepatitis C in people who inject drugs in the cities of Rawalpindi and Islamabad, Pakistan. Biomed Rep. 7:263–266. 2017.
Saeed U, Waheed Y, Ashraf M, Waheed U, Anjum S and Afzal MS: Estimation of hepatitis B virus, hepatitis C virus, and different clinical parameters in the thalassemic population of capital twin cities of Pakistan. Virology (Auckl). 6:11–16. 2015.
Bartenschlager R, Ahlborn-Laake L, Mous J and Jacobsen H: Nonstructural protein 3 of the hepatitis C virus encodes a serine-type proteinase required for cleavage at the NS3/4 and NS4/5 junctions. J Virol. 67:3835–3844. 1993.
Failla C, Tomei L and De Francesco R: An amino-terminal domain of the hepatitis C virus NS3 protease is essential for interaction with NS4A. J Virol. 69:1769–1777. 1995.
Kim JL, Morgenstern KA, Lin C, Fox T, Dwyer MD, Landro JA, Chambers SP, Markland W, Lepre CA, O'Malley ET, et al: Crystal structure of the hepatitis C virus NS3 protease domain complexed with a synthetic NS4A cofactor peptide. Cell. 87:343–355. 1996.
Grakoui A, McCourt DW, Wychowski C, Feinstone SM and Rice CM: Characterization of the hepatitis C virus-encoded serine proteinase: Determination of proteinase-dependent polyprotein cleavage sites. J Virol. 67:2832–2843. 1993.
Waheed Y, Saeed U, Anjum S, Afzal MS and Ashraf M: Development of global consensus sequence and analysis of highly conserved domains of the HCV NS5B protein. Hepat Mon. 12:e6142. 2012.
Eckart MR, Selby M, Masiarz F, Lee C, Berger K, Crawford K, Kuo C, Kuo G, Houghton M and Choo QL: The hepatitis C virus encodes a serine protease involved in processing of the putative nonstructural proteins from the viral polyprotein precursor. Biochem Biophys Res Commun. 192:399–406. 1993.
Shimizu Y, Yamaji K, Masuho Y, Yokota T, Inoue H, Sudo K, Satoh S and Shimotohno K: Identification of the sequence on NS4A required for enhanced cleavage of the NS5A/5B site by hepatitis C virus NS3 protease. J Virol. 70:127–132. 1996.