An accurate prognostic staging system for hepatocellular carcinoma patients after curative hepatectomy

The aim of this study was to develop an accurate predictive system for prognosis of hepatocellular carcinoma (HCC) patients after hepatectomy. We pooled data of clinicopathological features of 234 HCC patients who underwent curative hepatectomy. On the basis of the pooled data, we established a simple predictive staging system (PS score) scored by the mathematical product of tumor number and size, and degree of liver function. We compared the prognostic abilities of the PS score (score 0–3) with those of six well-known clinical staging systems. Then, we found that there were significant differences (P<0.05) in both disease-free survival (DFS) and overall survival (OS) between patients with different PS scores (PS score 0 vs. 1; PS score 1 vs. 2), and there was a significant difference in DFS, but not OS, between patients with PS score 2 and those with PS score 3. Moreover, the PS score had smaller values of the Akaike information criterion for both DFS and OS than any of the six well-known clinical staging systems. These results suggest that the PS score serves as a simple, accurate predictor for the prognosis of HCC patients after hepatectomy.


Introduction
Hepatocellular carcinoma (HCC) is one of the most common malignancies worldwide. Liver resection for HCC has the highest local controllability among all local treatments; however, the recurrence rates of HCC remain high even after curative hepatectomy (1)(2)(3). Because of the high recurrence rate and poor prognosis, the prognostic assessment and selection of treatment strategy in HCC patients are quite important (4,5). In particular, a precise stratification system for the prognosis of HCC patients is required in parallel to the advent of effective systemic treatment options (6).
It is well known that the prognosis of HCC depends on both tumor factors (i.e., size and extent of primary tumor) and host factors (i.e., liver function) (7); however, the latter is not integrated in the tumor-node-metastasis (TNM) staging system, which is generally accepted as a standard approach for prognostication in many cancer clinical staging systems. Even when primary HCC is completely treated, recurrence is observed much more frequently in the form of multicentric carcinogenesis in the residual cirrhotic liver, because the potential for multicentric carcinogenesis increases with the progression of chronic liver disease and liver cirrhosis (2,3). Therefore, from the standpoint of prognosis, a staging system based on information regarding both tumor factors and host factors such as liver function is required to accurately classify HCC patients undergoing various therapeutic options. For this reason, many prognostic staging systems for HCC, such as the Japan Integrated Staging score (JIS score), modified JIS score, the Cancer of the Liver Italian Program (CLIP) score, and the Tokyo score, have been proposed during the last two decades (8)(9)(10)(11). However, the prognostic ability of these systems cannot be universally accepted because of the variation in cohorts and/or subjects. It remains unclear which of the staging systems is most accurate for predicting the prognosis of HCC. Therefore, in the present study, we developed a new staging system that had high accuracy in predicting the recurrence of HCC by deriving data for factors used in the known staging systems.
[...] the portal or hepatic veins. We excluded 29 patients who had follow-up periods <5 years, and excluded 41 patients who had undergone treatment options such as percutaneous ethanol injection therapy (PEIT), microwave coagulation therapy (MCT), radiofrequency ablation (RFA), or transcatheter arterial chemoembolization (TACE) prior to surgery. Ultimately, 234 patients were enrolled in the study.
Data on tumor factors such as size of the main tumor, number of tumors, tumor differentiation, and vascular invasion were based on the final pathological findings of the resected liver. Laboratory data, including albumin, bilirubin, prothrombin activity, platelet count, indocyanine green retention rate at 15 min (ICG-R15), α-fetoprotein (AFP), and positivity for viral markers (hepatitis B surface antigen and anti-hepatitis C antibody), were obtained before operation. The Child-Pugh classification (12), the degree of Liver Damage classification by the Liver Cancer Study Group of Japan (LCSGJ) (13), TNM staging system (LCSGJ), TNM staging system (Union for International Cancer Control, UICC), JIS score, modified JIS score, CLIP score, and Tokyo score were evaluated using these variables. All patients were followed up after hepatectomy until death or the date of last follow-up visit, and survival was censored in December 2013.
Development of predictive staging system (PS score). We attempted to identify tumor factors that were closely related to the prognosis of HCC after surgery. Subsequently, we constructed a novel multidimensional staging system by combining the identified tumor factor and some host factors.
By analyzing the variables and their combinations that were used in the TNM staging system (data not shown), we found that the mathematical product (NxS factor) of tumor number and size (cm) had high accuracy in predicting recurrence of HCC. We next determined the optimal cut-off values of the NxS factor at 4 and 9 in reference to the Milan criteria (single tumor ≤5 cm in size or ≤3 tumors each ≤3 cm in size) (14). The cut-off points of the NxS factor at 4 and 9 classified the 234 patients into 3 groups. All HCC patients with NxS factor <4 were within the Milan criteria and had a low incidence of recurrence. Almost all patients with NxS factor >9 were outside the Milan criteria and had a high recurrence rate. The remaining HCC patients with NxS factor 4-9 were considered to be at intermediate risk for recurrence of HCC (data not shown).
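As a minimal sketch of the grouping described above, the following Python computes the NxS factor and assigns one of the three groups using the cut-off values 4 and 9, and checks the number-and-size part of the Milan criteria as quoted in the text. The function names and the group labels (`low`, `intermediate`, `high`) are hypothetical, the size is taken to be that of the largest tumor in cm, and the boundary values 4 and 9 are assumed to fall in the intermediate group.

```python
from typing import Sequence


def nxs_factor(tumor_sizes_cm: Sequence[float]) -> float:
    """Product of tumor number and size (cm) of the largest tumor."""
    if not tumor_sizes_cm:
        raise ValueError("at least one tumor is required")
    return len(tumor_sizes_cm) * max(tumor_sizes_cm)


def nxs_group(nxs: float) -> str:
    """Group by the cut-off values 4 and 9 (labels are illustrative)."""
    if nxs < 4:
        return "low"
    if nxs <= 9:  # assumption: 4 and 9 themselves belong to the 4-9 group
        return "intermediate"
    return "high"


def within_milan(tumor_sizes_cm: Sequence[float]) -> bool:
    """Single tumor <=5 cm, or <=3 tumors each <=3 cm (as quoted in the text)."""
    n = len(tumor_sizes_cm)
    if n == 0:
        return False
    if n == 1:
        return tumor_sizes_cm[0] <= 5
    return n <= 3 and max(tumor_sizes_cm) <= 3


if __name__ == "__main__":
    sizes = [2.0, 3.0, 1.5]  # hypothetical patient with three tumors
    value = nxs_factor(sizes)
    print(value, nxs_group(value), within_milan(sizes))
```

The sketch takes a list of individual tumor sizes because the Milan criteria need every tumor's size, whereas the NxS factor needs only the count and the largest size.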
Among the host factors used in four staging systems (JIS score, m-JIS score, CLIP score, and Tokyo score), we searched for the factors with the highest prognostic probability in the postoperative clinical course of HCC patients in combination with the NxS factor.
Statistics. Disease-free survival (DFS) and overall survival (OS) curves were plotted with the Kaplan-Meier method. Differences in DFS and OS between the groups were compared by using a log-rank test in univariate analysis. Variables that had statistical significance (P<0.05) for DFS in the univariate analysis were subsequently entered into a multivariate Cox proportional hazards model. Then, we established a new prognostic scoring system (PS score) by using the combination of factors that retained significance in multivariate analysis. To compare the prognostic ability of each staging system, the Akaike information criterion (AIC) (15) within a Cox proportional hazards regression model was used as a measure of relative goodness-of-fit. The AIC statistic was defined as AIC = -2 log likelihood + 2 × (the number of parameters in the model), where the -2 log likelihood was calculated using the Cox model. A smaller AIC value therefore indicates a more desirable model for predicting outcome (9,10,16-18). For all tests, P<0.05 was considered significant. Statistical analysis was performed using JMP version 9.0 (SAS Institute Japan, Tokyo, Japan) and the Statistical Package for Social Sciences version 11 (SPSS Japan, Tokyo, Japan).
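The AIC comparison can be illustrated with a short Python sketch. It only evaluates the definition given above from a log likelihood and a parameter count that are assumed to come from an already fitted Cox model; it does not fit any model. The function names, model names, and numerical values are hypothetical placeholders and are not results of this study.

```python
from typing import Dict, List, Tuple


def aic(log_likelihood: float, n_params: int) -> float:
    """AIC = -2 log likelihood + 2 x number of parameters."""
    return -2.0 * log_likelihood + 2 * n_params


def rank_by_aic(models: Dict[str, Tuple[float, int]]) -> List[Tuple[str, float]]:
    """Return (model name, AIC) pairs sorted from smallest to largest AIC."""
    scored = [(name, aic(ll, k)) for name, (ll, k) in models.items()]
    return sorted(scored, key=lambda item: item[1])


if __name__ == "__main__":
    # Placeholder (log likelihood, number of parameters) pairs, not study data.
    candidates = {
        "model_a": (-500.0, 1),
        "model_b": (-498.5, 3),
    }
    for name, value in rank_by_aic(candidates):
        print(name, value)
```

In this made-up example, the model with the slightly better log likelihood still ranks second because the penalty for its two extra parameters outweighs the gain in fit.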

Results
Patient characteristics. The baseline characteristics of the 234 HCC patients are shown in Table I. There were statistically significant differences in DFS and OS between the two groups divided according to whether they fit the Milan criteria (Fig. 1A and B). The 1-year DFS rates were 88.2, 69.2 and 56.5%, the 3-year DFS rates were 54.6, 34.6 and 13.0%, and the 5-year DFS rates were 42.7, 23.1 and 10.9% in patients with NxS factor <4, NxS factor 4-9 and NxS factor >9, respectively (Fig. 1C and D). There were statistically significant differences in DFS and OS among the groups classified by NxS factor (P<0.0001) (Fig. 1C and D).
Log-rank analysis for DFS identified NxS factor, microscopic portal vein invasion, microscopic hepatic vein invasion, Child-Pugh classification, and degree of Liver Damage as significant prognostic factors (Table II). DFS was not associated with tumor differentiation or serum AFP levels. Multivariate analysis of the five variables that achieved statistical significance in the univariate analysis for DFS revealed that NxS factor and the degree of Liver Damage classification were independent predictors (Table II). Notably, the hazard ratio (HR) of the degree of Liver Damage classification showed higher values (HR 1.93; 95% CI 1.37-2.70; P=0.0002) in comparison with other host factors.
PS score. Given that the NxS factor and the degree of Liver Damage classification were independent risk factors for HCC prognosis by multivariate analysis, we constructed the PS score by combining the NxS factor with the degree of Liver Damage classification. The score for the NxS factor can be easily obtained by allocating NxS factor <4, 4-9 and >9 to scores 0, 1 and 2, respectively (Table III). The score for the degree of Liver Damage classification can be similarly obtained by allocating Liver Damage A and B to scores 0 and 1, respectively (Table III).
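The allocation above can be sketched in Python as follows. The text does not spell out how the two component scores are combined; the sketch assumes that they are summed, which is consistent with the stated PS score range of 0-3. The function names are hypothetical, the boundary values 4 and 9 are assumed to fall in the 4-9 group, and only Liver Damage A and B are handled because no allocation is given for other grades.

```python
def nxs_component(tumor_number: int, largest_size_cm: float) -> int:
    """Score 0, 1 or 2 for NxS factor <4, 4-9 and >9, respectively."""
    if tumor_number < 1 or largest_size_cm <= 0:
        raise ValueError("tumor number and size must be positive")
    nxs = tumor_number * largest_size_cm
    if nxs < 4:
        return 0
    if nxs <= 9:  # assumption: 4 and 9 themselves belong to the 4-9 group
        return 1
    return 2


def liver_damage_component(grade: str) -> int:
    """Score 0 for Liver Damage A and 1 for Liver Damage B."""
    scores = {"A": 0, "B": 1}
    if grade not in scores:
        # No allocation is described for other grades, so none is invented here.
        raise ValueError(f"no score defined for Liver Damage {grade!r}")
    return scores[grade]


def ps_score(tumor_number: int, largest_size_cm: float, liver_damage: str) -> int:
    """PS score (0-3), assuming the two component scores are added."""
    return nxs_component(tumor_number, largest_size_cm) + liver_damage_component(liver_damage)


if __name__ == "__main__":
    # Hypothetical patient: two tumors, largest 3.0 cm, Liver Damage B.
    print(ps_score(2, 3.0, "B"))
```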
The Kaplan-Meier estimated DFS curves and OS curves according to PS score, TNM staging system (LCSGJ), TNM [...] system (UICC), CLIP score, JIS score, m-JIS score and Tokyo score] (Table IV).
Table II. Univariate analysis (log-rank test) and multivariate analysis (Cox proportional hazards model) of variables potentially predictive of disease-free survival in patients with HCC (n=234).

Discussion
We have developed a novel clinical scoring system, the PS score, for accurate prediction of the outcome of HCC patients after surgery by combining tumor factors and liver function. The most important findings of the present study were that there were significant differences in DFS between HCC patients with different PS scores (score 0 vs. 1; score 1 vs. 2; score 2 vs. 3), and the PS score had smaller values of the AIC for DFS than any of the six well-known clinical staging systems. Such a precise stratification for DFS in HCC patients after surgery will provide a better understanding of the metastatic potential of HCC, which will become more and more essential in parallel with the advent of effective systemic treatment options including molecular-targeted agents (6). Our successful result was largely due to the integration of tumor number and the size of the largest tumor into the NxS factor; tumor number and size of largest tumor were each used as the main parameters in the TNM staging systems. The Milan criteria (14) hinted that we could increase the predictive power of the outcome of HCC patients if we applied the Milan criteria to our system. We found that the NxS factor had high accuracy in predicting HCC recurrence by setting the optimal two cut-off values based on the Milan criteria. Notably, the NxS factor can be obtained from modalities such as computed tomography (CT) or magnetic resonance imaging (MRI). Therefore, our newly-developed PS score is an easy-to-use preoperative assessment tool because the information on pathological vessel involvement is not needed, which is one of the parameters for the conventional TNM staging systems and is integrated into the JIS score and the modified JIS score (7,16,19).
Unsuccessful preoperative assessment causes a potential limitation in these clinical staging systems due to discrepancy between the pre- and postoperative status of HCC patients (19). In this regard, the PS score is attractive because the score can be determined preoperatively via several imaging modalities. HCC patients have various backgrounds and divergent clinical courses, resulting in the lower predictive power of many clinical staging systems (20)(21)(22). In particular, liver function is considered to be important from the viewpoint of multicentric carcinogenesis and the need for additional surgical therapies (23,24).
Recently, several staging systems for HCC based on information of both tumor factors and liver function have been proposed. The CLIP score has been successful in discriminating HCC patients with regard to OS (11,25) and has been well validated in predicting the prognosis of HCC patients (26)(27)(28). Nevertheless, the CLIP score functioned poorly in predicting OS and DFS in Japanese HCC patients enrolled in the present study (Figs. 2F and 3F). One possible explanation of this deficiency is that the criterion of 50% for tumor extension in the liver used in the CLIP score system failed to accurately capture features of HCC in Japan, where many smaller tumors are detected based on the established screening system for HCC.
To overcome the deficiency of the CLIP score, Kudo et al developed the JIS score, composed of the Child-Pugh classification and the TNM staging system of LCSGJ. Kudo et al reported that this system had superior prognostic capabilities regarding OS in HCC patients when compared with the CLIP score (10,29). Thereafter, the JIS score gradually came to be used for the evaluation of treatment modalities in Japan.
In the present study, the JIS score worked poorly in predicting DFS and OS of HCC patients undergoing hepatic resection because most of them had Child-Pugh A disease. There was less advantage with the JIS score, which was made by combining two systems, the Child-Pugh classification and the TNM staging system of the LCSGJ (8,16). Therefore, many Japanese liver surgeons have not considered the JIS score to be effective in HCC patients who have undergone hepatic resection (30).
More recently, Nanashima et al developed the m-JIS score, which was constructed using the degree of Liver Damage and the TNM staging system of the LCSGJ. These investigators showed that the m-JIS score was a better predictor of prognosis than the JIS score in HCC patients who underwent hepatic resection (8,16). Ikai et al validated this system using the records of 42,269 HCC patients (31). In the present study, the m-JIS score demonstrated better stratification for prognosis after surgery and a smaller AIC value than did the JIS score. This finding is supported by the concept that the assessment of the degree of Liver Damage classification in the m-JIS score, in which ICG-R15 was substituted for encephalopathy in the Child-Pugh classification, could evaluate and classify liver function more precisely than the Child-Pugh classification.
ICG-R15 (representing indocyanine green clearance) has been used in the field of surgery routinely in Japan as a useful marker of hepatic function and as a gauge for decision-making regarding the permitted volume of hepatic resection (32,33). Moreover, ICG-R15 is a significant prognostic factor in HCC patients (34)(35)(36). The substitution of ICG-R15 for encephalopathy in the PS score might have reflected individual differences more accurately among patients who underwent hepatic resection in the present study, because none of them had any encephalopathy before operation.
In addition, the PS score had a smaller AIC value than did the m-JIS score with respect to both DFS and OS. This finding is supported by the concept that the NxS factor was more prognostic than TNM staging by the LCSGJ for tumor characteristics. The difference between the PS score and the m-JIS was confined to the tumor factor because the degree of Liver Damage was incorporated in both the PS score and the m-JIS score as liver function.
The Tokyo score is a new prognostic scoring system for patients who are candidates for radical therapy, such as percutaneous ablation or surgical resection (9). This system consists of four factors: tumor size, number of tumor nodules, serum albumin and bilirubin. These factors can be obtained from laboratory data or images before surgery. Although the Tokyo score can also become a clinical staging system for HCC patients to predict their prognosis before operation, this system has demonstrated poorer stratification than did the PS score in the present study. The Tokyo score may have been especially limited in its ability to stratify patients with an advanced score (Tokyo scores 4-6) in the present study, because most patients who had advanced cancer with poor liver function in this study were not subjected to hepatic resection.
In the present study, we were unable to evaluate the Barcelona Clinic Liver Cancer (BCLC) staging system (37), Chinese University Prognostic Index (38), or other recently reported systems (17,39), because portal pressure, performance status, and alkaline phosphatase levels were not fully recorded for the patients enrolled in the present study. Although portal hypertension is accepted as a strong predictor of poor prognosis (40,41), formal measurement of hepatic venous pressure gradient is a special examination that is not routinely carried out in daily practice. However, all of the parameters used by the PS score can be easily and safely obtained before the operation.
In conclusion, the present study showed that the PS score was superior to any of the well-known systems, including CLIP, JIS, modified-JIS, and the Tokyo score in predicting the prognosis of HCC patients.
There were several limitations in the present study: it was a retrospective single-center study that enrolled only patients who underwent curative hepatectomy. In this regard, further studies will be needed to evaluate whether the robustness of the PS score in predicting prognosis could be maintained in a cohort in which the majority of the subjects were HCC patients who received non-surgical treatment.