Laura Moody, Guanying Bianca Xu, Yuan-Xiang Pan, Hong Chen
Heterogeneity of cancer means many tumorigenic genes are only aberrantly expressed in a subset of patients and thus follow a bimodal distribution, having two modes of expression within a single population. Traditional statistical techniques that compare sample means between cancer patients and healthy controls fail to detect bimodally expressed genes. We utilize a mixture modeling approach to identify bimodal microRNA (miRNA) across cancers, find consistent sources of heterogeneity, and identify potential oncogenic miRNA that may be used to guide personalized therapies. Pathway analysis was conducted using target genes of the bimodal miRNA to identify potential functional implications in cancer. In vivo overexpression experiments were conducted to elucidate the clinical importance of bimodal miRNA in chemotherapy treatments. In nine types of cancer, tumors consistently displayed greater bimodality than normal tissue. Specifically, in liver and lung cancers, high expression of miR-105 and miR-767 was indicative of poor prognosis. Functional pathway analysis identified target genes of miR-105 and miR-767 enriched in the phosphoinositide-3-kinase (PI3K) pathway, and analysis of over 200 cancer drugs in vitro showed that drugs targeting the same pathway had greater efficacy in cell lines with high miR-105 and miR-767 levels. Overexpression of the two miRNA facilitated response to PI3K inhibitor treatment. We demonstrate that while cancer is marked by considerable genetic heterogeneity, there is between-cancer concordance regarding the particular miRNA that are more variable. Bimodal miRNA are ideal biomarkers that can be used to stratify patients for prognosis and drug response in certain types of cancer.
Cancer is classically characterized by genomic instability and mutations that drive uncontrolled cellular growth, heightened angiogenesis and metastasis, and metabolic abnormalities. However, mutational and transcriptomic profiles are tumor-specific, resulting in a high degree of heterogeneity among cancer patients. Not only is there considerable heterogeneity between cancer types, but even two tumors of the same cancer type and stage often display very different genetic profiles. Transcriptomic heterogeneity is exemplified by bimodal gene expression. Bimodal genes can be defined as those having two modes of expression within the same population. Bimodal genes act as molecular switches that define cancer subtypes. One example of bimodal gene expression is the estrogen receptor (ESR1) in breast cancer. With regard to ESR1 expression, breast tumors fall into one of two molecular subtypes: one that expresses ESR1 (ER+) and one in which expression is shut down (ER-). These discoveries have been particularly informative not only in prognostic prediction but also in guiding treatment regimens and understanding the efficacy of hormone-based therapies and drugs that target specific receptors [2–4]. Thus, bimodal genes represent a set of tumorigenic genes that can motivate effective therapeutics.
RNA-seq data for methodological validation were downloaded from Genomic Data Commons (GDC, formerly TCGA). We tested our model on breast cancer mRNA expression data (Project ID: TCGA-BRCA), given the large number of patient tumor (n = 1,102) and control samples (n = 113), as well as the extensive characterization of specific bimodal genes in previous literature, e.g., estrogen receptor 1 (ESR1), human epidermal growth factor receptor 2 (HER2 or ERBB2), and progesterone receptor (PGR) [74, 75]. RNA-seq data were normalized according to gene length and total number of reads mapped, such that values were expressed as reads per kilobase of transcript per million mapped reads (RPKM).
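As a rough illustration of the RPKM normalization described above, the following Python sketch computes RPKM from raw read counts and gene lengths. The function name `rpkm`, the dictionary-based inputs, and the use of the summed counts as the total number of mapped reads are assumptions made for illustration; they are not taken from the authors' pipeline.

```python
def rpkm(read_counts, gene_lengths_bp):
    """Return RPKM per gene.

    RPKM = reads * 1e9 / (gene length in bp * total mapped reads),
    i.e. reads per kilobase of transcript per million mapped reads.

    Assumption: the total number of mapped reads is approximated by the
    sum of the counts supplied in `read_counts` (hypothetical input).
    """
    total_mapped = sum(read_counts.values())
    if total_mapped <= 0:
        raise ValueError("total mapped reads must be positive")
    return {
        gene: count * 1e9 / (gene_lengths_bp[gene] * total_mapped)
        for gene, count in read_counts.items()
    }


# Toy example with two hypothetical genes
counts = {"geneA": 500, "geneB": 1500}
lengths = {"geneA": 1000, "geneB": 3000}
values = rpkm(counts, lengths)
print(values)
```

In this toy input, geneB has three times the reads of geneA but is also three times as long, so the two genes receive the same RPKM value; this is the length correction the normalization is meant to provide.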
For identifying novel tumorigenic miRNA, miRNA-seq data was downloaded from GDC for nine types of tumors: breast (Project ID: TCGA-BRCA; n = 1,096), head and neck (H&N; Project ID: TCGA-HNSC; n = 523), kidney (Project ID: TCGA-KIRC, TCGA-KIRP; n = 835), liver (Project ID: TCGA-LIHC; n = 372), lung (Project ID: TCGA-LUAD, TCGA-LUSC; n = 997), prostate (Project ID: TCGA-PRAD; n = 498), stomach (Project ID: TCGA-STAD; n = 446), thyroid (Project ID: TCGA-THCA; n = 506), and uterine cancer (Project ID: TCGA-UCEC, TCGA-UCS, TCGA-SARC; n = 861). Additionally, miRNA-seq data was downloaded from GDC for normal tissue, including breast (n = 104), head and neck (n = 44), kidney (n = 105), liver (n = 50), lung (n = 91), prostate (n = 52), stomach (n = 45), thyroid (n = 59), and uterine (n = 30). These nine cancers were chosen due to data availability. Tumor samples were pooled from all stages and both genders.
Controlled mixture modeling validation
We first sought to validate a method that could reliably identify bimodal expression. To reduce false positives and focus only on genes that are relevant to cancer, bimodality was assessed using model-based clustering in both the cancer and control samples. Normalized RNA-seq data from breast tumors and non-tumor mammary tissue were downloaded from Genomic Data Commons (GDC, formerly TCGA). The data were then log2-transformed for further analysis. All stages were included in the analysis, for a total of 1,102 tumor samples and 113 non-tumor samples.
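The text above does not spell out the clustering model, so the following Python sketch shows only one plausible way to score bimodality after a log2 transform. It assumes a two-component Gaussian mixture with a shared variance, fitted by expectation-maximization, and summarizes the fit with a commonly used bimodality index of the form BI = sqrt(pi(1 - pi)) |mu1 - mu2| / sigma. The function names, the pseudocount of 1, the quartile-based initialization, and the simulated data are all assumptions for illustration and should not be read as the authors' implementation.

```python
import math
import random


def log2_transform(values, pseudocount=1.0):
    """log2(x + pseudocount); the pseudocount is an assumption to handle zeros."""
    return [math.log2(v + pseudocount) for v in values]


def fit_two_gaussians(x, n_iter=500, tol=1e-9):
    """EM fit of a two-component 1-D Gaussian mixture with a shared variance.

    Returns (pi, mu1, mu2, sigma), where pi is the weight of component 1.
    """
    xs = sorted(x)
    n = len(xs)
    mean = sum(xs) / n
    var = max(sum((v - mean) ** 2 for v in xs) / n, 1e-12)
    mu1, mu2, pi = xs[n // 4], xs[(3 * n) // 4], 0.5  # quartile-based start
    prev_ll = -math.inf
    for _ in range(n_iter):
        # E-step: responsibility of component 1, computed in log space
        resp, ll = [], 0.0
        for v in xs:
            la = math.log(pi) - (v - mu1) ** 2 / (2 * var)
            lb = math.log(1 - pi) - (v - mu2) ** 2 / (2 * var)
            m = max(la, lb)
            total = math.exp(la - m) + math.exp(lb - m)
            resp.append(math.exp(la - m) / total)
            ll += m + math.log(total) - 0.5 * math.log(2 * math.pi * var)
        # M-step: update weight, means, and the shared variance
        n1 = min(max(sum(resp), 1e-9), n - 1e-9)
        pi = n1 / n
        mu1 = sum(r * v for r, v in zip(resp, xs)) / n1
        mu2 = sum((1 - r) * v for r, v in zip(resp, xs)) / (n - n1)
        var = max(
            sum(r * (v - mu1) ** 2 + (1 - r) * (v - mu2) ** 2
                for r, v in zip(resp, xs)) / n,
            1e-12,
        )
        if abs(ll - prev_ll) < tol:  # stop once the log-likelihood stabilizes
            break
        prev_ll = ll
    return pi, mu1, mu2, math.sqrt(var)


def bimodality_index(x):
    """BI = sqrt(pi * (1 - pi)) * |mu1 - mu2| / sigma for the fitted mixture."""
    pi, mu1, mu2, sigma = fit_two_gaussians(x)
    return math.sqrt(pi * (1 - pi)) * abs(mu1 - mu2) / sigma


# Simulated (not real) log2-scale expression values for two hypothetical genes
random.seed(0)
bimodal = ([random.gauss(2, 0.5) for _ in range(150)]
           + [random.gauss(8, 0.5) for _ in range(150)])
unimodal = [random.gauss(5, 1) for _ in range(300)]
bi_bimodal = bimodality_index(bimodal)
bi_unimodal = bimodality_index(unimodal)
print("bimodal gene BI:", bi_bimodal)
print("unimodal gene BI:", bi_unimodal)
```

Under these assumptions, a gene with two well-separated expression modes receives a much larger index than a unimodal gene. Computing the index separately for tumor and control samples would mirror the tumor-versus-control comparison described above, although any cutoff for calling a gene bimodal would need to be chosen and validated on real data.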
In the present study, a novel methodology was applied to identify bimodal miRNA across cancer types. To our knowledge, this is the first study to analyze the functional role of bimodal miRNA. Furthermore, we are the first to investigate large-scale bimodal expression patterns across cancers using next-generation sequencing data from clinical samples. We showed that high levels of bimodal miRNA expression were characteristic of cancer. Moreover, several bimodal miRNA were common to multiple cancer types, suggesting that certain miRNA consistently account for tumor heterogeneity and may be involved in general oncogenic processes. The relevance of these bimodal miRNA was confirmed by showing that they could be used to predict overall survival and drug response. To illustrate the importance of bimodal miRNA, we focused specifically on miR-105 and miR-767. The two miRNA were bimodally expressed in liver and lung cancer, and high expression was indicative of poor prognosis. In addition, cells with high miR-105 and miR-767 expression responded better to PI3K-inhibiting drugs. We demonstrate that bimodal miRNA are viable biomarkers in cancer and may equip physicians to better understand patient prognosis and devise effective treatment strategies.
The authors acknowledge Dr. Suparna Mantha of the Cancer Center at the Carle Foundation Hospital for helpful advice and constructive suggestions in conceptualizing the study. The authors thank the members of the Pan and Chen laboratories for their technical support and helpful discussions throughout the study.
In the current study, we identified bimodal miRNA across cancer types and showed how they could be used for patient stratification based on prognosis and drug response in several types of cancer. We first devised an approach for identifying bimodal miRNA and applied our model to miRNA-seq data. We are the first to examine genome-wide bimodal expression across cancers using sequencing data. We found that certain miRNA were bimodally expressed in multiple cancer types, suggesting that they may be associated with general oncogenic characteristics. Specifically, we examined the importance of miR-105 and miR-767 in predicting overall survival in liver and lung cancer as well as in facilitating sensitivity to PI3K-inhibiting drugs. Our study provides a framework for finding bimodal expression and demonstrates the role of bimodal miRNA in tumorigenesis as well as their potential in predicting patient survival and enabling effective treatment.
Citation: Moody L, Xu GB, Pan Y-X, Chen H (2022) Genome-wide cross-cancer analysis illustrates the critical role of bimodal miRNA in patient survival and drug responses to PI3K inhibitors. PLoS Comput Biol 18(5): e1010109. https://doi.org/10.1371/journal.pcbi.1010109
Editor: Yue Li, McGill University Faculty of Science, CANADA
Received: May 15, 2021; Accepted: April 15, 2022; Published: May 31, 2022
Copyright: © 2022 Moody et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: Codes used to calculate bimodality index, to generate histograms, and to plot survival curves have been deposited in GitHub (https://github.com/nutrigenelab/bimodality).
Funding: The research is supported by the following grants: grants from the USDA Cooperative State Research, Education and Extension Service (Hatch project numbers ILLU-971-344 and ILLU-698-369) to HC, and a grant from the Cancer Scholars for Translational and Applied Research (C*STAR) program from Carle Foundation Hospital, and the Office of the Vice Chancellor for Research at the University of Illinois at Urbana-Champaign to LM and YXP. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.