Abstract (english) | Background: Hepatocellular carcinoma (HCC) occurs mostly in people with chronic liver disease and ranks sixth in terms of global instances of cancer, and fourth in terms of cancer deaths for men. Although abdominal ultrasound (US) is used as an initial test to exclude the presence of focal liver lesions, and serum alpha-foetoprotein (AFP) measurement may raise suspicion of HCC occurrence, further testing is required to confirm the diagnosis and to stage HCC. Current guidelines recommend surveillance programmes using US, with or without AFP, to detect HCC in high-risk populations despite the lack of clear benefits on overall survival. Assessing the diagnostic accuracy of US and AFP may clarify whether the absence of benefit in surveillance programmes could be related to under-diagnosis. Therefore, assessment of the accuracy of these two tests for diagnosing HCC in people with chronic liver disease, not included in surveillance programmes, is needed. Objectives: Primary: to assess the diagnostic accuracy of US and AFP, alone or in combination, for the diagnosis of HCC of any size and at any stage in adults with chronic liver disease, either in a surveillance programme or in a clinical setting. Secondary: to assess the diagnostic accuracy of abdominal US and AFP, alone or in combination, for the diagnosis of resectable HCC; to compare the diagnostic accuracy of the individual tests versus the combination of both tests; to investigate sources of heterogeneity in the results. Search methods: We searched the Cochrane Hepato-Biliary Group Controlled Trials Register, the Cochrane Hepato-Biliary Group Diagnostic-Test-Accuracy Studies Register, Cochrane Library, MEDLINE, Embase, LILACS, and Science Citation Index Expanded, until 5 June 2020. We applied no language or document-type restrictions. 
Selection criteria: Studies assessing the diagnostic accuracy of US and AFP, independently or in combination, for the diagnosis of HCC in adults with chronic liver disease, with cross-sectional and case-control designs, using one of the acceptable reference standards, such as pathology of the explanted liver, histology of resected or biopsied focal liver lesion, or typical characteristics on computed tomography or magnetic resonance imaging, all with a six-month follow-up. Data collection and analysis: We independently screened studies, extracted data, and assessed the risk of bias and applicability concerns, using the QUADAS-2 checklist. We presented the results of sensitivity and specificity, using paired forest plots, and tabulated the results. We used a hierarchical meta-analysis model where appropriate. We presented uncertainty of the accuracy estimates using 95% confidence intervals (CIs). We double-checked all data extractions and analyses. Main results: We included 373 studies. The index test was AFP (326 studies, 144,570 participants); US (39 studies, 18,792 participants); and a combination of AFP and US (eight studies, 5454 participants). We judged all but one study to be at high risk of bias. Most studies used different reference standards, often inappropriate to exclude the presence of the target condition, and the time interval between the index test and the reference standard was rarely defined. Most studies with AFP had a case-control design. We also had major concerns about applicability due to the characteristics of the participants. As the primary studies with AFP used different cut-offs, we performed a meta-analysis using the hierarchical summary receiver operating characteristic model, then we carried out two meta-analyses including only studies reporting the most used cut-offs: around 20 ng/mL or 200 ng/mL. 
AFP cut-off 20 ng/mL: for HCC (147 studies) sensitivity 60% (95% CI 58% to 62%), specificity 84% (95% CI 82% to 86%); for resectable HCC (six studies) sensitivity 65% (95% CI 62% to 68%), specificity 80% (95% CI 59% to 91%). AFP cut-off 200 ng/mL: for HCC (56 studies) sensitivity 36% (95% CI 31% to 41%), specificity 99% (95% CI 98% to 99%); for resectable HCC (two studies) one with sensitivity 4% (95% CI 0% to 19%), specificity 100% (95% CI 96% to 100%), and one with sensitivity 8% (95% CI 3% to 18%), specificity 100% (95% CI 97% to 100%). US: for HCC (39 studies) sensitivity 72% (95% CI 63% to 79%), specificity 94% (95% CI 91% to 96%); for resectable HCC (seven studies) sensitivity 53% (95% CI 38% to 67%), specificity 96% (95% CI 94% to 97%). Combination of AFP (cut-off of 20 ng/mL) and US: for HCC (six studies) sensitivity 96% (95% CI 88% to 98%), specificity 85% (95% CI 73% to 93%); for resectable HCC (two studies) one with sensitivity 89% (95% CI 73% to 97%), specificity 83% (95% CI 76% to 88%), and one with sensitivity 79% (95% CI 54% to 94%), specificity 87% (95% CI 79% to 94%). The observed heterogeneity in the results remains mostly unexplained, and is only partly attributable to different cut-offs or settings (surveillance programme compared to clinical series). The sensitivity analyses, excluding studies published as abstracts, or with case-control design, showed no variation in the results. We compared the accuracy obtained from studies with AFP (cut-off around 20 ng/mL) and US: a direct comparison in 11 studies (6674 participants) showed a higher sensitivity of US (81%, 95% CI 66% to 90%) versus AFP (64%, 95% CI 56% to 71%) with similar specificity: US 92% (95% CI 83% to 97%) versus AFP 89% (95% CI 79% to 94%). 
A direct comparison of six studies (5044 participants) showed a higher sensitivity (96%, 95% CI 88% to 98%) of the combination of AFP and US versus US (76%, 95% CI 56% to 89%) with similar specificity: AFP and US 85% (95% CI 73% to 92%) versus US 93% (95% CI 80% to 98%). Authors' conclusions: In the clinical pathway for the diagnosis of HCC in adults, AFP and US, singly or in combination, have the role of triage tests. We found that, using AFP with 20 ng/mL as a cut-off, about 40% of HCC occurrences would be missed, and with US alone, more than a quarter. The combination of the two tests showed the highest sensitivity: less than 5% of HCC occurrences would be missed, with about 15% false-positive results. The uncertainty resulting from the poor study quality and the heterogeneity of included studies limits our ability to confidently draw conclusions based on our results. |
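The miss rates quoted in the conclusions follow from the pooled estimates as the complements of sensitivity and specificity. The sketch below reproduces that arithmetic in Python, assuming the pooled point estimates for HCC reported above and ignoring their confidence intervals. The names `POOLED`, `missed_and_false_positive`, and `summarise` are illustrative only and are not part of the review.

```python
# Pooled point estimates (percent) for HCC, as quoted in the abstract.
# Confidence intervals are deliberately left out of this sketch.
POOLED = {
    "AFP (20 ng/mL)": {"sensitivity": 60, "specificity": 84},
    "US": {"sensitivity": 72, "specificity": 94},
    "AFP + US": {"sensitivity": 96, "specificity": 85},
}


def missed_and_false_positive(sensitivity_pct, specificity_pct):
    """Return (missed %, false-positive %).

    Missed % is the share of people WITH HCC that the test does not flag
    (100 - sensitivity). False-positive % is the share of people WITHOUT
    HCC that the test flags anyway (100 - specificity).
    """
    return 100 - sensitivity_pct, 100 - specificity_pct


def summarise(pooled=POOLED):
    """Map each test to its (missed %, false-positive %) pair."""
    return {
        name: missed_and_false_positive(v["sensitivity"], v["specificity"])
        for name, v in pooled.items()
    }


if __name__ == "__main__":
    for name, (missed, false_pos) in summarise().items():
        print(f"{name}: {missed}% of HCC missed, {false_pos}% false positives")
```

Under these assumptions, AFP at 20 ng/mL misses 40% of HCC occurrences, US alone misses 28% (more than a quarter), and the combination misses 4% with 15% false positives, which matches the figures stated in the conclusions. The two percentages have different denominators: the first applies to people with HCC, the second to people without it.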