A multi-layer encoder prediction model for individual sample specific gene combination effect (MLEC-iGeneCombo)

Yun Shen, Kunjie Fan, Birkan Gökbağ, Nuo Sun, Chen Yang, Lijun Cheng, Lang Li

Abstract

In data from gene combination double knockout (CDKO) experiments, the top-ranked synthetic lethal (SL) gene pairs are highly inconsistent across different SL scores. This raises a significant concern that SL prediction models depend heavily on the choice of SL score. In this paper, we introduce a new gene combination effect (GCE) measurement: the log-fold change (LFC) of dual-gRNA expression before and after CRISPR-Cas9 lentivirus transfection.

Introduction

A gene combination effect is the joint genetic effect of two genes within a cell. The concept was first observed in model organisms, including fruit flies, in which researchers noted that mutations in the bar or glass genes occurred individually but never together [1–2]. Other examples involve paralog genes from the same family, which frequently share function, such as CCNL1/CCNL2 and CDK4/CDK6 in human lung cells [3].

Materials and methods

Dataset

Gene combination double knockout (CDKO) experiment data.

We adopted a subset of the dataset obtained from gene combination double knockout (CDKO) experiments [16] in the recently developed synthetic lethality knowledge base (SLKB), which includes 11 CDKO experiments, 22 cell lines, and LFC from 280,488 non-SL gene pairs.
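To make the LFC measurement concrete, the following is a minimal sketch of how a log-fold change of dual-gRNA abundance before and after transfection might be computed from raw read counts. The function name `log_fold_change`, the counts-per-million scaling, the base-2 logarithm, and the pseudocount of 1 are all illustrative assumptions; they are not taken from the SLKB pipeline or from the authors' code.

```python
import numpy as np


def log_fold_change(before, after, pseudocount=1.0):
    """Log2 fold change of dual-gRNA abundance (after vs. before).

    Assumptions (not from the paper): counts are scaled to counts per
    million to correct for sequencing depth, and a pseudocount avoids
    taking the logarithm of zero.
    """
    before = np.asarray(before, dtype=float)
    after = np.asarray(after, dtype=float)

    # Scale each sample to counts per million reads.
    before_cpm = before / before.sum() * 1e6
    after_cpm = after / after.sum() * 1e6

    # Negative values indicate depletion of the dual-gRNA construct.
    return np.log2((after_cpm + pseudocount) / (before_cpm + pseudocount))


# Toy read counts for two hypothetical dual-gRNA constructs.
lfc = log_fold_change(before=[500, 500], after=[250, 750])
print(lfc)
```

In this toy example the first construct is depleted after transfection (negative LFC) and the second is enriched (positive LFC).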

Results

Multi-omics feature selection in predicting gene combination effect

To identify the key features for predicting the gene combination score, we compared gene expression, essentiality, and copy number as input features using an MLP with the same structure as the multi-omics encoder. In Fig 4A, the y-axis represents the mean correlation between the predicted scores and the ground truth across 18 cell lines. Gene essentiality demonstrated the strongest predictive power for GCEs, followed by gene expression.
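The evaluation metric described here, a correlation computed per cell line and then averaged, can be sketched as follows. This is an illustration only: the text does not state which correlation coefficient was used, so Pearson correlation is assumed, and the function name, dictionary layout, and cell-line labels are hypothetical.

```python
import numpy as np


def mean_correlation_across_cell_lines(predicted, observed):
    """Average the per-cell-line correlation between predictions and truth.

    Both arguments map a cell-line name to a 1D sequence of gene-pair
    scores in the same order. Pearson correlation is an assumption.
    """
    correlations = []
    for cell_line, y_true in observed.items():
        y_true = np.asarray(y_true, dtype=float)
        y_pred = np.asarray(predicted[cell_line], dtype=float)
        # np.corrcoef returns a 2x2 matrix; the off-diagonal entry is r.
        correlations.append(np.corrcoef(y_pred, y_true)[0, 1])
    return float(np.mean(correlations))


# Toy scores for two hypothetical cell lines.
observed = {"cell_A": [0.1, -0.5, -1.2, 0.3], "cell_B": [-0.2, -0.9, 0.4, 0.0]}
predicted = {"cell_A": [0.0, -0.4, -1.0, 0.2], "cell_B": [0.1, -0.7, 0.3, -0.1]}
print(mean_correlation_across_cell_lines(predicted, observed))
```

Averaging per-cell-line correlations, rather than pooling all gene pairs, keeps cell lines with many measured pairs from dominating the summary.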

Discussion

We developed the MLEC-iGeneCombo model, which includes multi-omics, network, and cell-line encoders, to predict GCEs in new cells, and we investigated the prediction performance of the model and its sub-models using data from 18 cell lines in CDKO experiments. We showed that our MLEC-iGeneCombo model performed well in predicting GCEs, with prediction correlation above 0.80 in 11 of 18 cancer cell lines.

Acknowledgments

The authors would like to thank the Ohio Supercomputer Center (OSC) for providing computing resources.

Citation: Shen Y, Fan K, Gökbağ B, Sun N, Yang C, Cheng L, et al. (2025) A multi-layer encoder prediction model for individual sample specific gene combination effect (MLEC-iGeneCombo). PLoS Comput Biol 21(10): e1013547. https://doi.org/10.1371/journal.pcbi.1013547

Received: April 6, 2025; Accepted: September 20, 2025; Published: October 3, 2025

Copyright: © 2025 Shen et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: Our source code is available at https://github.com/karenyun/MLEC-iGeneCombo.

Funding: The author(s) received no specific funding for this work.

Competing interests: The authors have declared that no competing interests exist.