Machine learning model predicts liver cancer risk from routine blood tests and health records

Written by Joseph Nordqvist, March 26, 2026 at 8:27 AM UTC

Researchers have developed a machine learning model that predicts an individual's risk of hepatocellular carcinoma (HCC), the most common form of liver cancer, using routine clinical information already collected during standard medical visits.

The study, published in Cancer Discovery, found that the model significantly outperformed all publicly available risk scores on both internal and external validation datasets.

HCC is the fifth most common cancer globally and the third leading cause of cancer-related death. Current screening guidelines primarily target patients with confirmed liver cirrhosis, but the study's analysis of UK Biobank data revealed that 69% of the 538 HCC cases occurred in patients who had no prior diagnosis of cirrhosis, viral hepatitis, or other chronic liver disease. This means the majority of cases are developing in people who might not be flagged for screening under current cirrhosis-focused criteria.

The research team, led by Jan Clusmann and Jakob Kather at TU Dresden and Carolin Schneider at RWTH Aachen University, built their framework using data from two large population-based cohorts: the UK Biobank (over 500,000 individuals) for model development and the US-based All of Us Research Program (over 400,000 individuals) for external validation.

“Our study highlights the potential of a simple, easily utilized machine learning model to improve risk stratification for HCC using only routinely collected clinical data,” said Carolin Schneider.

How the model works

The team trained random forest classifiers on five types of data: demographics, electronic health records, blood test results, genomics, and metabolomics. Each data type was tested on its own and in stepwise combinations with the others. A random forest aggregates many decision trees, each of which makes simple binary decisions on patient variables, and the final prediction is determined by the trees' collective output.
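To make the voting idea concrete, here is a minimal Python sketch of how a forest can combine its trees. The three hand-written trees, their feature names, and every cut-off are hypothetical and chosen only for illustration. A real random forest learns hundreds of trees from resampled training data, and this is not the study's model.

```python
# Toy "forest": each tree makes simple binary decisions on patient variables.
# All feature names and cut-offs below are hypothetical, not from the study.

def tree_a(p):
    return 1 if p["ast"] > 40 and p["platelets"] < 150 else 0

def tree_b(p):
    return 1 if p["age"] > 60 and p["diabetes"] else 0

def tree_c(p):
    return 1 if p["alt"] > 45 or p["waist_cm"] > 102 else 0

TREES = [tree_a, tree_b, tree_c]

def forest_predict_proba(patient, trees=TREES):
    """Return the fraction of trees that vote for the positive class."""
    votes = [tree(patient) for tree in trees]
    return sum(votes) / len(votes)

def forest_predict(patient, trees=TREES):
    """Majority vote: positive if at least half of the trees say so."""
    return int(forest_predict_proba(patient, trees) >= 0.5)

# Hypothetical patient record.
patient = {"age": 67, "ast": 55, "alt": 30, "platelets": 120,
           "diabetes": True, "waist_cm": 95}

print(forest_predict_proba(patient), forest_predict(patient))
```

Two of the three toy trees vote positive for this patient, so the forest flags the patient. Library implementations often average each tree's predicted probability instead of counting hard votes, but the principle of pooling many weak decisions is the same.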

The best-performing model combined demographics, electronic health records, and routine blood tests, achieving an area under the receiver operating characteristic curve (AUROC) of 0.88. Adding genomic or metabolomic data provided only marginal improvement.
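For readers unfamiliar with the metric, AUROC can be read as the probability that a randomly chosen patient who develops HCC receives a higher risk score than a randomly chosen patient who does not, with ties counting as half. The sketch below computes it directly from that definition. The labels and scores are made-up toy values, not study data.

```python
def auroc(labels, scores):
    """AUROC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Toy data (hypothetical): 1 = developed HCC, 0 = did not.
labels = [0, 0, 1, 0, 1, 0]
scores = [0.10, 0.40, 0.35, 0.20, 0.80, 0.05]

print(auroc(labels, scores))  # 7 of the 8 positive/negative pairs are ordered correctly
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking. The pairwise loop is quadratic and only suitable for small examples; rank-based implementations are used for cohort-sized data.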

The researchers then reduced the model's complexity through ablation experiments. A simplified version using just 15 routinely collected clinical features still outperformed all existing risk prediction scores, including FIB-4, APRI, NFS, and the aMAP score. The top features driving predictions included liver enzymes (AST, ALT), platelet count, diabetes status, waist circumference, age, and liver cirrhosis diagnosis.
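Two of the existing scores named above, FIB-4 and APRI, are simple closed-form formulas over some of the same routine inputs. The sketch below uses their commonly published definitions; these are not taken from the article, the default upper limit of normal for AST (40 U/L) is an assumption that varies by laboratory, and the example patient is invented.

```python
import math

def fib4(age_years, ast_u_l, alt_u_l, platelets_1e9_l):
    """FIB-4 = (age x AST) / (platelets x sqrt(ALT)), standard published form."""
    if alt_u_l <= 0 or platelets_1e9_l <= 0:
        raise ValueError("ALT and platelet count must be positive")
    return (age_years * ast_u_l) / (platelets_1e9_l * math.sqrt(alt_u_l))

def apri(ast_u_l, platelets_1e9_l, ast_uln_u_l=40.0):
    """APRI = (AST / upper limit of normal) / platelets x 100.

    The default upper limit of normal (40 U/L) is an assumption.
    """
    if platelets_1e9_l <= 0 or ast_uln_u_l <= 0:
        raise ValueError("platelet count and AST upper limit must be positive")
    return (ast_u_l / ast_uln_u_l) / platelets_1e9_l * 100.0

# Hypothetical patient: 60 years old, AST 80 U/L, ALT 64 U/L, platelets 120 x 10^9/L.
print(fib4(60, 80, 64, 120))
print(round(apri(80, 120), 2))
```

Both scores combine a handful of variables with fixed arithmetic, whereas a random forest can pick up interactions and non-linear effects across all 15 features, which may be one reason for the performance gap the study reports.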

Outperforming existing tools

The aMAP score, the best-performing existing tool, achieved an AUROC of 0.79 in the general population, compared with 0.88 for the best ML model. The ML models also achieved 3 to 10 times higher precision than existing linear risk scores. Precision is a critical metric for rare-event prediction, where false positives carry significant clinical cost.

On precision-recall curves, which are particularly important for imbalanced datasets like cancer screening, existing scores performed poorly (area under the precision-recall curve of 0.00 to 0.02), while the ML models reached substantially higher values.
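A short worked calculation shows why precision is so hard to achieve for a rare outcome. The sketch below derives precision from sensitivity, specificity, and prevalence. The prevalence uses the 538 HCC cases reported above over a rounded cohort size of 500,000, which is an approximation, and the sensitivity and specificity values are hypothetical, not results from the study.

```python
def precision(sensitivity, specificity, prevalence):
    """Precision (positive predictive value) from test characteristics."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    if true_pos + false_pos == 0:
        raise ValueError("no positive predictions; precision is undefined")
    return true_pos / (true_pos + false_pos)

# Assumption: 538 cases over a cohort rounded to 500,000 people.
PREVALENCE = 538 / 500_000

# Hypothetical operating points, not study results.
baseline = precision(sensitivity=0.80, specificity=0.90, prevalence=PREVALENCE)
sharper = precision(sensitivity=0.80, specificity=0.99, prevalence=PREVALENCE)

print(f"precision at 90% specificity: {baseline:.2%}")
print(f"precision at 99% specificity: {sharper:.2%}")
```

With roughly one case per thousand people, a test that catches 80% of cases at 90% specificity still has a precision below 1%, because false positives vastly outnumber true positives. Raising specificity to 99% lifts precision to roughly 8%, which illustrates how a several-fold gain in precision can come mostly from cutting false positives.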

Using the model's three-tier risk classification system, over 70% of HCC cases were classified into the high-risk group in both the general population and the patients-at-risk subgroup. The high- and medium-risk groups together captured approximately 88% of cases.
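The tiering step can be pictured as two cut-offs applied to the model's risk score, followed by a count of how many true cases land in each tier. The article does not give the study's thresholds, so the cut-offs, scores, and labels in this sketch are all hypothetical.

```python
from collections import Counter

# Hypothetical cut-offs; the study's actual thresholds are not given here.
LOW_CUT = 0.2
HIGH_CUT = 0.6

def risk_tier(score):
    """Map a risk score in [0, 1] to one of three tiers."""
    if score >= HIGH_CUT:
        return "high"
    if score >= LOW_CUT:
        return "medium"
    return "low"

def case_capture(scores, labels):
    """Share of true cases (label 1) that fall into each tier."""
    tiers = Counter(risk_tier(s) for s, y in zip(scores, labels) if y == 1)
    total = sum(tiers.values())
    if total == 0:
        raise ValueError("no positive cases to summarize")
    return {tier: tiers.get(tier, 0) / total for tier in ("high", "medium", "low")}

# Toy data (hypothetical): five cases and two non-cases.
scores = [0.90, 0.70, 0.65, 0.30, 0.10, 0.05, 0.50]
labels = [1, 1, 1, 1, 1, 0, 0]

print(case_capture(scores, labels))
```

In this toy example the high tier captures 60% of cases and the high and medium tiers together capture 80%. The same kind of summary underlies the capture figures reported for the study's tiers.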

External validation and generalizability

External validation on the All of Us cohort, which has substantially greater ethnic diversity than the UK Biobank, showed the model maintained comparable performance. Despite being trained predominantly on data from white participants (94% of the UK Biobank cohort), the model showed no significant performance gap between white and non-white subgroups in the more diverse All of Us population.

The model did show better performance for male than female patients, a finding the authors attribute to the higher prevalence of HCC in males and a potentially less prominent HCC phenotype in female patients. This performance gap was less pronounced in the All of Us cohort.

Practical deployment

The researchers have released all code and model weights openly and provide three deployment options: an interactive web calculator on Hugging Face for single-patient inference, a Python package for batch processing, and compatibility with agentic workflows through the Model Context Protocol (MCP).

The study has several limitations. It relies on a retrospective design, includes only a small proportion of patients with viral hepatitis (a major HCC risk factor globally), and has not yet been validated in Asian populations, where viral hepatitis prevalence is higher.

The authors note that prospective clinical trials will be needed before the model can be recommended for clinical adoption.

The study was supported by German Cancer Aid, the German Federal Ministry of Research, Technology and Space, the German Research Foundation, and several other European and US funding bodies.

Disclaimer

This article was written with the assistance of Claude by Anthropic and Gemini by Google, as part of AI News Home's commitment to transparency in AI-assisted journalism. All analysis, conclusions, and editorial decisions were made by human editors.

Joseph Nordqvist

Joseph founded AI News Home in 2026. He studied marketing and later completed a postgraduate program in AI and machine learning (business applications) at UT Austin’s McCombs School of Business. He is now pursuing an MSc in Computer Science at the University of York.

References

  1. Jan Clusmann, Paul-Henry Koop, David Y. Zhang, Felix van Haag, Omar S.M. El Nahhas, Tobias Seibel, Laura Žigutytė, Apichat Kaewdech, Julien Calderaro, Frank Tacke, Tom Luedde, Daniel Truhn, Tony Bruns, Kai Markus Schneider, Jakob Nikolas Kather, Carolin V. Schneider. Machine learning predicts hepatocellular carcinoma risk from routine clinical data: a large population-based multicentric study. Cancer Discovery (AACR), March 26, 2026.
