Human papillomaviruses and breast cancer: A systematic review and meta‑analysis

    • Charalampos Karachalios
    • Stamatios Petousis
    • Chrysoula Margioula‑Siarkou
    • Konstantinos Dinas
  • Published online on: December 22, 2023
  • Article Number: 75
  © Karachalios et al. This is an open access article distributed under the terms of Creative Commons Attribution License.

Breast cancer (BC) is the leading malignancy worldwide. The association between human papillomavirus (HPV) and BC is debatable. The present systematic review and meta‑analysis aimed to assess the prevalence of HPV DNA in malignant breast tumors. An extensive search of the PubMed and SCOPUS databases was carried out for case‑control studies published between January 1, 2003 and January 7, 2023, which compared HPV DNA detection in breast tissue specimens of female patients with BC and women with absent or benign breast disorders. Once the initial title/abstract screening was completed by two independent investigators, the full texts of the included studies from that stage were reviewed by the aforementioned investigators to determine if they should be included in the present study. Data extraction was independently conducted by two investigators. A third investigator was consulted to resolve disagreements through free discussion. MedCalc was used for quantitative synthesis. The significance of association was estimated by pooled odds ratios (ORs) with 95% confidence intervals (CIs) calculated using the random‑effects model. A total of 23 primary studies, including 3,243 subjects (2,027 patients and 1,216 controls), were eligible for quantitative analysis. HPV prevalence in patients with BC and controls was 21.95 and 8.96%, respectively. The prevalence of HPV differed significantly between the two groups (OR 3.83; 95% CI 2.03‑7.25; P<0.01). Heterogeneity among studies was quantified using the I2 index which was 69.57% (95% CI 51.89‑80.75). The risk of bias was assessed using an appropriate tool contributed by the CLARITY Group at McMaster University. Seven studies had a low risk of bias, 15 studies had a moderate risk of bias and only one study had a serious risk of bias. These results reinforce the hypothesis that HPV is involved in BC development and progression, indicating a possible role of HPV vaccination in BC prevention.

Introduction

Breast cancer (BC) is considered a major disease burden globally (1). It is not only the most prevalent malignancy, accounting for one third of cancer cases in female patients, but the breast is also the malignancy site associated with the majority of cancer-related deaths in developed and developing countries (2-4). BC incidence rates have steadily increased internationally over the last 10 years by ~20% (4,5). According to the Global Cancer Observatory of the International Agency for Research on Cancer of the World Health Organization, a total of 2,261,419 new BC cases were diagnosed in 2020, and a total of 684,996 deaths were attributed to BC. Among neoplastic diseases, the incidence of BC ranked 1st worldwide, even surpassing that of lung cancer (2,206,771 new cases in 2020) (6).

A wide variety of genetic, environmental and lifestyle parameters have been described over time as possible risk factors that likely contribute to mammary gland tumorigenesis (7,8). The viral etiology of breast carcinoma has been addressed in numerous studies, but remains controversial (9). Several viruses, including bovine leukemia virus, Kaposi's sarcoma-associated herpesvirus, Epstein-Barr virus (EBV), mouse mammary tumor virus (MMTV), simian vacuolating virus 40 (10), human mammary tumor virus (11), cytomegalovirus (CMV) (12), herpes simplex virus-1 (HSV) and human herpes virus type-8 (13), were proposed as potential breast oncogenic factors a number of years ago. In 1944, the discovery that MMTV caused BC in mice led researchers to investigate a possible viral contribution to BC (14). In 1990, it was reported that HPVs are capable of immortalizing healthy mammary epithelial cells and decreasing their dependence on growth factors (15). Since Di Lonardo first demonstrated in 1992 the potential relationship between HPV infection and BC by detecting HPV-16 DNA in ~30% of breast and lymph node samples (16), an increasing number of studies have reported the detection of HPV DNA in patients with BC (4,17-19). HPV+ breast tumors exhibit more aggressive characteristics than HPV- breast tumors, such as occurring at a younger age, having a higher grade of malignancy, showing an estrogen receptor-negative status and having a higher Ki67 index (20).

The current published literature on HPV types and BC is divergent, as the prevalence of HPV infection in BC specimens varies (range, 0–86.21%) (5,21). Study designs, populations and HPV detection techniques differ across studies and thus yield inconsistent results (15). The purpose of the present systematic review and meta-analysis was to assess the prevalence of HPV DNA in BC tissues compared with that in healthy and benign mammary specimens obtained from controls, thus determining the possible association between HPV infection and breast tumorigenesis.

Materials and methods

Data sources and search strategy

The PubMed and SCOPUS databases were searched for full-text, English-language journal articles reporting case-control studies of adult female individuals (>19 years of age) that were published between January 1, 2003 and January 7, 2023 and evaluated the association between HPV DNA and BC. Relevant studies were identified using the terms ‘HPV’ or ‘human papillomavirus’ in combination with ‘breast cancer’, ‘breast carcinoma’ or ‘mammary carcinoma’. The reference lists of pertinent articles were further manually searched for potentially eligible studies. The ‘Related Citations’ tool in PubMed was also applied whenever a suitable article was included. The exact search strings for every database can be found in Data SI. The search was designed using the Systematic Review Accelerator (22). The present systematic review was conducted in compliance with the Preferred Reporting Items for Systematic Reviews and Meta-Analysis (PRISMA) guidelines (23).

Inclusion and exclusion criteria

The inclusion criteria were: i) Case-control studies; ii) adult patients aged >19 years histologically diagnosed with BC, irrespective of tumor stage, grade and histologic type; iii) controls with a histologic confirmation of normal breast tissue or benign breast disorders; and iv) HPV DNA detected by polymerase chain reaction (PCR). The exclusion criteria were: i) Non-case-control studies, including books or book chapters, conference abstracts, theses, press articles, expert, narrative and systematic reviews, editorials/letters to the editor, medical hypotheses, case reports and case series, as well as cohort studies; ii) presence of concomitant disease site(s) with precancerous or cancerous lesions; iii) HPV genome extracted from samples other than breast tissue, such as milk and blood; iv) patients and/or non-patients with prior neoplasia, surgery for malignancy, radiation or cytotoxic therapy; v) absence of healthy/benign controls/specimens; vi) co-presence of other oncogenic viruses; vii) HPV presence detected by methods other than PCR; and viii) low quality papers, as identified by the use of Critical Appraisal Skills Program Checklists (24).

Study screening and selection

Two investigators, CK and SP, independently selected the articles that met the inclusion criteria. Once the initial title/abstract screening was completed, the full texts of the studies retained at that stage were reviewed by the same investigators to determine whether they should be included. Discrepancies were resolved by consulting a third investigator, KD. This stage was completed with the assistance of the automation tools Screenatron and Disputatron (22).

Data extraction

Study characteristics and data outcomes from every study were recorded in a data extraction form. Data extraction was independently conducted by two researchers, CK and SP. A third researcher, CMS, was consulted to resolve disagreements through open discussion. The following data regarding study characteristics and outcomes were extracted from every included study: First author, publication year, country of origin, type of samples, number of BC samples, most prevalent BC histological type, mean/median patient age, number of HPV+ BC samples and most prevalent HPV type in patients with BC.

Quality assessment

The quality of included studies was evaluated using the Critical Appraisal Skills Program Case Control Study Checklist (Table SI) (24). This list was divided into three sections and contained 11 questions on validity, effect and generalizability of the study. Every question could be answered with a ‘Yes (Y)’, a ‘No (N)’, or an ‘I can't tell (?)’. Every ‘Y’ counted as 1 point, while ‘N’ and ‘?’ counted as 0 points. A score range of 0–4 indicated low quality, 5–7 indicated moderate quality and 8–11 high quality. Only high and moderate quality papers were included in the present meta-analysis. CK and SP independently evaluated the quality of the included studies. Disagreements were resolved by discussion with CMS and KD.
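
The scoring rule described above can be sketched in a few lines of Python. This is only an illustration: the function names and the example answers are hypothetical and do not reproduce any study assessed in Table SI.

```python
from typing import Iterable


def casp_score(answers: Iterable[str]) -> int:
    """Count the 'Y' answers; 'N' and '?' contribute 0 points."""
    return sum(1 for answer in answers if answer == "Y")


def casp_quality(score: int) -> str:
    """Map a score of 0-11 to the quality bands given in the text."""
    if not 0 <= score <= 11:
        raise ValueError("score must be between 0 and 11")
    if score <= 4:
        return "low"
    if score <= 7:
        return "moderate"
    return "high"


# Hypothetical answers to the 11 checklist questions (not real study data).
example_answers = ["Y", "Y", "N", "Y", "?", "Y", "Y", "N", "Y", "Y", "Y"]
example_score = casp_score(example_answers)
example_quality = casp_quality(example_score)
```

Under this rule, a study would be retained only when `casp_quality` returns "moderate" or "high".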

Statistical analysis

Odds ratios (ORs) with 95% confidence intervals (CIs) were used as measures of the strength of association for every case-control study. Heterogeneity among the included studies was tested using the Q-test, which evaluates whether such heterogeneity affected the results, and was quantified using the I2 statistic. According to the Cochrane Handbook for Systematic Reviews of Interventions (version 6.3) (25), an I2 of 0–40% implies that heterogeneity might not be important, an I2 of 30–60% may represent moderate heterogeneity, an I2 of 50–90% may represent substantial heterogeneity and an I2 of 75–100% indicates considerable heterogeneity among studies. When studies demonstrated at least moderate heterogeneity (I2>50%), the random-effects model was used for analysis. Therefore, and owing to several differences among the included case-control studies, such as the ethnicity of participants and PCR methodology, data analysis was performed using the random-effects model. The meta-analysis was conducted using MedCalc (version 20.210; MedCalc Software Ltd.), which combined effect sizes and created a forest plot. To evaluate the influence of individual studies on the overall estimate, sensitivity analysis was performed. Egger's linear regression test was used to evaluate potential publication bias. P<0.05 was considered to indicate a statistically significant difference.
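
For readers unfamiliar with random-effects pooling, the following Python sketch shows one common approach, the DerSimonian-Laird estimator applied to log ORs with inverse-variance weights. It is an assumption that this resembles the computation performed by MedCalc; the exact weighting used by the software may differ. The 2x2 tables are hypothetical and the 0.5 continuity correction for zero cells is likewise an illustrative choice.

```python
import math


def log_or_and_var(a, b, c, d):
    """a: HPV+ cases, b: HPV- cases, c: HPV+ controls, d: HPV- controls."""
    if 0 in (a, b, c, d):
        # Continuity correction for zero cells (an illustrative assumption).
        a, b, c, d = (x + 0.5 for x in (a, b, c, d))
    return math.log((a * d) / (b * c)), 1 / a + 1 / b + 1 / c + 1 / d


def random_effects_pool(tables):
    """DerSimonian-Laird pooling of log odds ratios."""
    effects = [log_or_and_var(*t) for t in tables]
    y = [e[0] for e in effects]
    v = [e[1] for e in effects]
    w = [1 / vi for vi in v]
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))
    df = len(y) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    # Between-study variance, truncated at zero.
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    w_re = [1 / (vi + tau2) for vi in v]
    pooled = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return {
        "or": math.exp(pooled),
        "ci_low": math.exp(pooled - 1.96 * se),
        "ci_high": math.exp(pooled + 1.96 * se),
        "q": q,
        "df": df,
        "i2": i2,
        "tau2": tau2,
    }


# Hypothetical 2x2 tables (not data from the included studies).
example_tables = [(10, 40, 2, 48), (15, 85, 5, 95), (8, 42, 6, 44)]
example_result = random_effects_pool(example_tables)
```

When the between-study variance is estimated as zero, the random-effects weights reduce to the fixed-effect inverse-variance weights, so the two models give the same pooled OR.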

Results

Collection and selection of studies

A PRISMA flowchart of the study selection is depicted in Fig. S1. A total of 159 records, 142 from PubMed and 17 from SCOPUS databases, were initially identified. Using the Deduplicator tool (23), five duplicates were removed by browsing titles and abstracts. No further records were manually retrieved from reference lists. The remaining 154 articles were screened, and 79 of them were excluded on the basis of title and abstract because they were not pertinent to the search. The remaining 75 articles were then assessed for eligibility, and 52 records were excluded based on the aforementioned exclusion criteria (Table SII) (4,9,10,14,19,26-73). The final step of eligibility criteria implementation (Table SIII) resulted in a total of 23 original case-control studies being included in the present meta-analysis (Table SIV).
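
The record counts reported above can be checked with simple arithmetic. The short Python sketch below uses only the numbers stated in the text; the function and key names are illustrative.

```python
def prisma_flow(identified, duplicates, excluded_title_abstract, excluded_full_text):
    """Follow the record counts through each screening stage."""
    after_deduplication = identified - duplicates
    full_text_assessed = after_deduplication - excluded_title_abstract
    included = full_text_assessed - excluded_full_text
    return {
        "after_deduplication": after_deduplication,
        "full_text_assessed": full_text_assessed,
        "included": included,
    }


# 142 PubMed records plus 17 SCOPUS records, as reported in the text.
flow = prisma_flow(
    identified=142 + 17,
    duplicates=5,
    excluded_title_abstract=79,
    excluded_full_text=52,
)
```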

Baseline characteristics of the included studies and assessment of risk of bias

The 23 included studies involved a total of 3,243 breast tissue samples, collected between 2004 and 2022 from nine countries worldwide (Table SIV) (3,7,18,21,74-92). A total of six continental regions (Asia, Europe, Oceania, North Africa and South America) were represented. Iran was the major representative (n=6; 26.09%). Five studies did not mention mean patient age, and two studies mentioned median age instead. The mean patient age in the remaining 16 studies was 51.34 years (range, 35.2–59.0 years). The specimen types used for HPV detection were as follows: A total of 11 studies (47.83%) used formalin-fixed paraffin-embedded (FFPE) tissues, one study (4.35%) used fresh tissues, six studies (26.09%) used fresh frozen tissues, one study (4.35%) used frozen tissues and the remaining four studies (17.39%) used paraffin-embedded (PE) tissues. Across all of the case-control studies, 445 tissues in the BC group (2,027 cancer specimens) and 109 tissues in the control group (1,216 normal and benign specimens) were found to be HPV+. Therefore, the overall HPV prevalence in the present systematic review was 17.08% (n=554/3,243), the HPV DNA prevalence in patients with BC was 21.95% (n=445/2,027), and the HPV DNA prevalence in controls was 8.96% (n=109/1,216). Invasive ductal carcinoma (IDC) was the most prevalent BC type reported (n=17/23; ~74%). The range of IDC prevalence was 31.08–97.33%. The oncogenic HPV-16 high-risk type was the most frequently detected HPV type (n=9/23; 39.13%) (21). Assessment of risk of bias was performed with the assistance of the ‘Tool to Assess Risk of Bias in Case Control Studies’ contributed by the CLARITY Group at McMaster University (Table SIII). A total of seven studies had a low risk of bias, 15 studies had a moderate risk of bias and only one study had a serious risk of bias.
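
The prevalence figures above follow directly from the reported counts, as the Python sketch below illustrates. It also computes a crude OR from the collapsed 2x2 table purely for orientation; this crude value is not the pooled estimate of the meta-analysis, which weights each study separately under the random-effects model. Function and variable names are illustrative.

```python
def prevalence(positive: int, total: int) -> float:
    """Prevalence as a percentage."""
    return 100.0 * positive / total


def crude_odds_ratio(pos_cases, n_cases, pos_controls, n_controls):
    """Odds ratio from a single collapsed 2x2 table (no study weighting)."""
    a, b = pos_cases, n_cases - pos_cases
    c, d = pos_controls, n_controls - pos_controls
    return (a * d) / (b * c)


# Counts reported in the text.
bc_pos, bc_total = 445, 2027
ctrl_pos, ctrl_total = 109, 1216

bc_prev = prevalence(bc_pos, bc_total)
ctrl_prev = prevalence(ctrl_pos, ctrl_total)
overall_prev = prevalence(bc_pos + ctrl_pos, bc_total + ctrl_total)
crude_or = crude_odds_ratio(bc_pos, bc_total, ctrl_pos, ctrl_total)
```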


The results of the pooled analysis for the 23 included studies are summarized in the forest plot (Fig. S2). The random-effects model was applied since I2=69.57% (95% CI, 51.89–80.75) (93). The summary OR was 3.83 (95% CI, 2.03–7.25; P<0.001). Heterogeneity was substantial (Q=62.43 with 19 degrees of freedom; P<0.0001). The impact of publication bias was examined by conducting Egger's linear regression test. The P-value of Egger's test was 0.26 (P>0.05), indicating no statistically significant evidence of publication bias.
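
The reported I2 can be recovered from the Q statistic and its degrees of freedom using the standard relation I2 = max(0, (Q - df)/Q) x 100%. The Python sketch below applies it to the values given in the text; the function name is illustrative.

```python
def i_squared(q: float, df: int) -> float:
    """Higgins' I2 (as a percentage) from Cochran's Q and its degrees of freedom."""
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


# Q and degrees of freedom as reported in the text.
i2 = i_squared(62.43, 19)
```

Rounded to two decimals, the result agrees with the I2 of 69.57% reported for the pooled analysis.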

Discussion

Summary of evidence

The present systematic review and meta-analysis suggested that there is a statistically significant difference between the prevalence of HPV DNA extracted from BC specimens and that of HPV DNA isolated from breast tissue samples of female patients with either benign breast disorders or healthy breasts (Table SV). Studies published in the last 20 years confirm the controversial nature of the alleged association of HPV with BC. There are reports which favor the impact of HPVs in BC oncogenesis (21,75,78,80-82,84,89), while other studies have confirmed a limited or nonexistent role of HPVs in the development of BC (3,7,18,74,76,77,79,83,85-88,90-92).

Interpretation of results

The current meta-analysis confirms the wide variations existing among the included studies. Various reasons have been described for the aforementioned discrepancy, which not only affects the possibility of an association of HPV with BC, but also the distribution of the HPV genotypes: Geographic region(s) of the samples, genetic background, cultural variations such as variable sexual practices and patterns, sample types such as fresh versus FFPE samples, age of the selected populations, sensitivity of the detection methods used, condition of the preserved samples that can hamper DNA quality, the histological type of BCs (IDC versus ductal adenocarcinoma in situ), the absence of an international standard assay for HPV genotyping and the unintentional contamination of samples with the genome of other potentially oncogenic viruses, such as EBV, MMTV, CMV and HSV (2,4,7,13,31,94-98).

The mean patient age in the present systematic review was 51.34 years. Although BC incidence increases with advancing age, HPV prevalence decreases with age (7). Younger women usually tend to be more sexually active than older ones, risking infection with HPV (83). As a result, the average age of women with HPV+ BC is notably lower than that of women with HPV- BC (96). Estrogen expression, which is higher in younger women compared with that in older counterparts, acts as a co-factor to HPV infection, modulating oncogenesis (78). The need to vaccinate young girls against HPV is evident.

Approximately 50% of the studies included in the present systematic review isolated HPV DNA from FFPE specimens. Technical aspects can also modify the detection rate of the HPV genome and expression in BC. DNA extracted from PE tissues is of poor quality due to partial degradation (2,7). When HPV DNA is extracted from PE tissues, HPV detection rates can be overestimated compared with those from fresh tissues (15). Moreover, the commercial kits used for DNA extraction and amplification by PCR, as well as the specific type of PCR used in every study, nested or quantitative PCR, differed among the included studies. Furthermore, it is considered a challenge to completely eliminate the possibility of specimen contamination with other oncogenic viruses, such as EBV (10), which could lead to carcinogenesis caused by these viruses and not by HPV (98). Moreover, the low viral loads detected in breast specimens impede the identification of HPV DNA in patients with BC (99). When cell transformation is completed, viral replication ceases and the viral genome is incorporated into the host genome, with HPV DNA copies being reduced following integration (3). The ‘concentration’ of HPV, that is, the copy number of HPV/mg of tumor, in breast tumors appeared to be lower than its concentration in cervical cancer, which may make the detection of HPV in BC harder, despite amplification by PCR (100).

Another notable finding of the present systematic review is that the most prevalent HPV subtypes in patients with BC reported in four studies published in the last 10 years were low-risk HPVs (LR-HPVs) (75,78,82,83). These studies showed that 43.17% of patients with BC were LR-HPV DNA+ (117/271). All four studies originated from lower income countries of the Middle East and South America. Future studies could investigate the potential role of LR-HPVs in breast carcinogenesis, and define the association of potential risk factors, such as low socioeconomic status, with the development of LR-HPV+ BC.

Comparison with previous published literature

The overall HPV prevalence in the present systematic review was 17.08% (554/3,243), the HPV DNA prevalence in patients with BC was 21.95% (445/2,027), and the HPV DNA prevalence in controls was 8.96% (109/1,216). These results are mainly in agreement with an Iranian systematic review which analyzed 61 Iranian studies and concluded that HPV prevalence in BC was 14% (101). By contrast, the aforementioned numerical findings of the present review are lower than those of a systematic review from Iran which examined 11 studies from around the globe and concluded that the HPV DNA prevalence in 858 Iranian patients with BC from six Iranian studies was 23.6% (102). In 2018, another meta-analysis from Belgium, which included 40 studies and a total of 4,762 patients with BC, estimated a 20% pooled prevalence of HPV in BC tissue specimens (99). A previous systematic review from Brazil examined 29 studies from five continents and concluded that HPV DNA prevalence was 23.0% in 1,932 cases and 12.9% in 279 controls (100). Li et al (15) conducted a meta-analysis of 21 studies from 17 countries and four continents, and revealed that the overall HPV prevalence in 1,184 BC cases was 24.49%. It is unclear whether the implementation of HPV vaccination has led to a possible decrease of HPV+ BC.


The limitations of the present systematic review and meta-analysis include differences among the included studies in design, PCR methodology (such as the exact PCR conditions and primers used) and HPV genotyping techniques. Moreover, the present review did not include older studies published prior to 2004. Studies with a substantial number of cases but without controls were excluded from the present review, while eight case-control studies of moderate quality were included. Furthermore, the authors of the included studies were not contacted for additional details, such as Tumor-Node-Metastasis stage, tumor grade and histopathological diagnosis, which would have increased the number of possible subgroup analyses.


HPVs seem to be involved in breast tumorigenesis. Future studies with larger sample sizes will shed more light on the role of HPVs in BC, given the burden of this disease to society. If the role of HPVs in BC is established, HPV vaccines could be used for BC prevention and immunotherapy.

The abstract of the current study has been previously published.


Funding: No funding was received.

Availability of data and materials

The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.

Authors' contributions

CK was involved in the conceptualization of the study, methodology used, formal analysis, investigation and preparation of the original draft. SP was involved in the methodology, data validation, writing, reviewing and editing the study, and project supervision. CMS validated the data, carried out investigation, wrote, reviewed and edited the manuscript, and completed project supervision. KD was involved in the conceptualization of the study, data visualization, supervision and project administration. Data authentication is not applicable. CK and SP confirm the authenticity of all the raw data. All authors have read and approved the final version of the manuscript.

Ethics approval and consent to participate

Not applicable.

Patient consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.





Abbreviations

BC, breast cancer
CI, confidence interval
EBV, Epstein-Barr virus
FFPE, formalin-fixed paraffin-embedded
HPV, human papillomavirus
HSV, herpes simplex virus-1
IDC, invasive ductal carcinoma
MMTV, mouse mammary tumor virus
OR, odds ratio
PCR, polymerase chain reaction
PRISMA, Preferred Reporting Items for Systematic Reviews and Meta-Analysis



