
An interpretable machine learning framework for adverse drug reaction prediction from drug-target interactions

Abstract

Background

Adverse drug reactions (ADRs) present challenges to patient safety and healthcare systems. Current pharmacovigilance methods, such as the Yellow Card Scheme (YCS), provide valuable post-marketing data, but the mechanistic causes of these ADRs are not fully understood. Leveraging drug-target interaction data with interpretable machine learning offers a promising approach to anticipate ADRs and understand their underlying mechanisms.

Introduction

Adverse drug reactions (ADRs) are unwanted or harmful reactions that occur when a drug is administered correctly, at the recommended dose, to the appropriate patient, and for its intended purpose [1]. ADRs pose a significant challenge to healthcare: they affect patient health and quality of life (QoL) and place a substantial economic strain on health services. In the UK, ADRs account for ~16.5% of all in-patient hospital admissions and cost the NHS ~£2.2 billion annually [1].

Materials and methods

2.1 Drug interaction data

Drug-target interaction data were obtained from STITCH v5.0 (accessed 16 January 2025) [17]. Data files for human interactions and chemical identifiers were downloaded, providing the interactions between drugs and human targets. Each interaction has a confidence score ranging from 0 (no confidence) to 1 (high confidence); interactions with missing data or no supporting evidence were conservatively assigned a score of 0.
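The scoring rule above can be sketched in a few lines of Python. This is a minimal illustration only, not the authors' pipeline: the record layout, the drug and target names, and the `build_feature_matrix` helper are all hypothetical, and the sketch assumes scores have already been expressed on the 0 to 1 scale described in the text.

```python
from collections import defaultdict

# Hypothetical (drug, target, score) records; None marks a missing score.
# Names and values are illustrative, not taken from STITCH.
records = [
    ("drugA", "TGT1", 0.92),
    ("drugA", "TGT2", None),   # missing evidence -> scored 0
    ("drugB", "TGT1", 0.40),
    ("drugB", "TGT3", 0.75),
]

def build_feature_matrix(records):
    """Return (drugs, targets, matrix) where matrix[i][j] is the confidence
    score of drug i against target j, with absent or missing scores set to 0."""
    scores = defaultdict(dict)
    targets = set()
    for drug, target, score in records:
        value = 0.0 if score is None else float(score)
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"score out of range for {drug}/{target}: {value}")
        targets.add(target)
        scores[drug][target] = value
    drugs = sorted(scores)
    targets = sorted(targets)
    # Pairs that never appear in the records also default to 0.
    matrix = [[scores[d].get(t, 0.0) for t in targets] for d in drugs]
    return drugs, targets, matrix

drugs, targets, matrix = build_feature_matrix(records)
for drug, row in zip(drugs, matrix):
    print(drug, row)
```

Each row is then a fixed-length feature vector for one drug, which is the form a tree-based classifier would typically consume.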

Results

3.1 Comparative analysis of clinical and real-world ADR data

The comparison of real-world evidence (RWE) data with the SIDER database is visualised as a heatmap in Fig 3A. To enhance interpretability, 9% of the drugs are highlighted. The x-axis shows each SOC model in alphabetical order. Fig 3B shows the distribution of the data. As expected, most ADR categories in both datasets were not significant.

Discussion

4.1 Comparative analysis of clinical and real-world ADR data

The comparative analysis between the YCS and SIDER databases reveals a striking divergence in ADR signals. Only 3.3% of significant ADRs are shared between the two sources, compared with 7.0% and 8.3% found uniquely in the YCS and SIDER, respectively. The low Jaccard index (17.6%) further reinforces the minimal overlap between clinical trial and real-world datasets.
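For readers unfamiliar with the Jaccard index, it is the size of the intersection of two sets divided by the size of their union. The sketch below is illustrative only: the function names are hypothetical, and it assumes the three quoted percentages share a common denominator, so that the index can be recovered from them directly. Under that assumption the rounded figures give roughly 17.7%, close to the reported 17.6%; the small gap presumably reflects rounding of the quoted percentages.

```python
def jaccard_index(a, b):
    """Jaccard index of two collections: |A intersect B| / |A union B|."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0  # convention chosen here for two empty sets
    return len(a & b) / len(union)

def jaccard_from_shares(shared, only_a, only_b):
    """Jaccard index from the shared share and the two unique shares,
    assuming all three are measured against the same denominator."""
    return shared / (shared + only_a + only_b)

# Toy sets: two of four distinct items are shared.
toy = jaccard_index({"a", "b", "c"}, {"b", "c", "d"})

# Rounded percentages quoted in the text: shared, YCS only, SIDER only.
j = jaccard_from_shares(3.3, 7.0, 8.3)
print(f"toy = {toy:.2f}, from quoted shares = {j:.3f}")
```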

Conclusion 

The study presents an interpretable machine learning framework that leverages drug-target interaction data to predict ADRs and provide pharmacologically meaningful insights. Using Random Forests trained on curated data from STITCH and the YCS, the framework identifies associations between drugs, targets and ADRs across 21 MedDRA SOCs. 

Citation: Roberts-Nuttall J, Jones AM, Castellani M, Pham D (2026) An interpretable machine learning framework for adverse drug reaction prediction from drug-target interactions. PLoS One 21(1): e0340900. https://doi.org/10.1371/journal.pone.0340900
Editor: Ali Awadallah Saeed, National University, SUDAN

Received: September 3, 2025; Accepted: December 28, 2025; Published: January 30, 2026

Copyright: © 2026 Roberts-Nuttall et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: All data and code used in this study are publicly available on GitHub at https://github.com/Joeroberts1601/Random_Forest_ADR_Prediction. All relevant data are within the paper and its Supporting Information files.

Funding: The author(s) received no specific funding for this work.

Competing interests: The authors have declared that no competing interests exist.

Joseph Roberts-Nuttall, Alan M. Jones, Marco Castellani, Duc Pham