Emily K. Makowski, John S. Schardt, Matthew D. Smith, Peter M. Tessier
SARS-CoV-2 variants with enhanced transmissibility represent a serious threat to global health. Here we report machine learning models that can predict the impact of receptor-binding domain (RBD) mutations on receptor (ACE2) affinity, which is linked to infectivity, and escape from human serum antibodies, which is linked to viral neutralization. Importantly, the models predict many of the known impacts of RBD mutations in current and former Variants of Concern on receptor affinity and antibody escape, as well as novel sets of mutations that strongly modulate both properties. Moreover, these models reveal key opposing impacts of RBD mutations on transmissibility, as many sets of RBD mutations predicted to increase antibody escape are also predicted to reduce receptor affinity and vice versa. These models, when used in concert, capture the complex impacts of SARS-CoV-2 mutations on properties linked to transmissibility and are expected to improve the development of next-generation vaccines and biotherapeutics.
The coronavirus pandemic has devastated mankind since 2019, and it is unclear when it will end given that the virus is expected to become endemic. Rapid development, approval, and distribution of vaccines have provided significant protection to vaccinated individuals. However, as both vaccine and infection immunity wane, new widespread variants with increased transmissibility threaten additional waves of devastation. In particular, mutations in the receptor-binding domain (RBD) of the spike (S1) protein have been shown to increase transmissibility through multiple mechanisms, including by i) increasing the affinity of the RBD for its cognate receptor, angiotensin-converting enzyme 2 (ACE2) [2–5] and ii) reducing RBD binding to human serum antibodies elicited by natural infection or vaccination [6–9]. For example, increased ACE2 affinity due to RBD mutations has been linked to increased transmissibility for viral lineages carrying the spike protein mutation D614G found in all current and former Centers for Disease Control and Prevention (CDC) ‘Variants of Concern’. Herein, we refer to all current and former CDC Variants of Concern simply as VOCs. Likewise, reduced human serum antibody binding due to RBD mutations is linked to increased transmission for variants with the K417N/T, E484K, and N501Y mutations found in the Beta and Gamma variants, which have been shown to have increased breakthrough infection rates in vaccinated individuals [11, 12]. Either of these two mechanisms, or a combination of both, may result in increased infection rates in unvaccinated or even vaccinated individuals, which has the potential to facilitate additional viral evolution and further increase transmissibility. Therefore, it is of great interest to accurately predict novel RBD mutations (and combinations thereof) that confer increased transmissibility. Such predictions may be useful to inform vaccine and biotherapeutic development and guide global health decisions.
Materials and methods
The dataset for ACE2 affinity was preprocessed for preliminary evaluation by averaging experimental measurements of identical sequences, which resulted in a final dataset of 64,617 RBD mutants. The data were then trimmed to exclude any KA,app measurements below 10⁶ M and above 10¹³ M. The dataset for human serum antibody escape, comprising two to three serum samples from each of 11 convalescent patients collected more than 30 days after the onset of symptoms, was preprocessed by averaging repeat values for each RBD mutant, for a total of 102,723 RBD mutants. Initial model testing showed that such averaging increased model accuracy, likely due to outlier smoothing.
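The preprocessing described above (averaging repeated measurements of identical sequences, then trimming by KA,app) can be sketched as follows. This is a minimal illustration and not the authors' code: the function names, the toy sequence fragments, and the choice to average raw KA,app values rather than log-transformed values are all assumptions made for the example. Only the two bounds come from the text.

```python
from collections import defaultdict
from statistics import mean

# Trimming bounds taken from the text (units as written in the source).
KA_MIN = 1e6
KA_MAX = 1e13


def average_replicates(records):
    """Average repeated measurements that share an identical sequence.

    `records` is an iterable of (sequence, value) pairs. Averaging the raw
    values is an assumption; the source does not state the scale used.
    """
    grouped = defaultdict(list)
    for sequence, value in records:
        grouped[sequence].append(value)
    return {sequence: mean(values) for sequence, values in grouped.items()}


def trim_by_affinity(averaged, low=KA_MIN, high=KA_MAX):
    """Keep only sequences whose averaged K_A,app lies within [low, high]."""
    return {seq: ka for seq, ka in averaged.items() if low <= ka <= high}


# Hypothetical toy records: (RBD sequence fragment, K_A,app).
records = [
    ("NITNLC", 2.0e9),
    ("NITNLC", 4.0e9),  # replicate of the same sequence
    ("NITKLC", 5.0e5),  # below the lower bound, removed by trimming
    ("NIYNLC", 1.0e10),
]

averaged = average_replicates(records)
trimmed = trim_by_affinity(averaged)
print(trimmed)
```

The same `average_replicates` helper would apply to the antibody escape dataset, where repeat values for each RBD mutant are averaged without a subsequent affinity-based trim.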
Tradeoffs between ACE2 affinity and human serum antibody binding
Towards our goal of developing models for predicting the impact of RBD mutations on several key properties linked to transmissibility, we first evaluated two large mutational datasets [2,6]. The first set includes the ACE2 affinities for 64,617 single and multisite RBD mutants, which we used to predict the affinities of mutated RBD sequences for ACE2, and the second set includes the percentage increases in human serum antibody escape for 102,723 single and multisite RBD mutants, which we used to predict the relative binding of human polyclonal antibodies to mutated RBD sequences. The former property (ACE2 affinity) is reported as apparent association constant (KA,app) values because the experimental data were measured using bivalent ACE2 (ACE2-Fc) and the apparent affinities are much higher than those for monovalent ACE2. The latter property (% antibody escape) is simply 100% minus the percentage of binding of human serum antibodies to a given RBD mutant relative to the wild-type RBD.
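As a worked illustration of the % antibody escape definition above, the short sketch below computes escape from a mutant binding signal and a wild-type binding signal. The function name and the input values are hypothetical, and the sketch assumes that both signals are measured on the same linear scale.

```python
def percent_escape(mutant_binding, wild_type_binding):
    """Return % antibody escape: 100 minus % binding relative to wild type."""
    if wild_type_binding <= 0:
        raise ValueError("wild-type binding signal must be positive")
    relative_binding = 100.0 * mutant_binding / wild_type_binding
    return 100.0 - relative_binding


# Hypothetical example: a mutant retaining 25% of wild-type serum binding.
print(percent_escape(0.25, 1.0))  # 75.0
```

Under this definition, a mutant that binds serum antibodies as well as the wild-type RBD has 0% escape, and a mutant with no detectable binding has 100% escape.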
As the SARS-CoV-2 pandemic persists and the virus likely becomes endemic, attention must be focused on identifying and managing variants that pose a significant risk to public health. To our knowledge, our models are the first to comprehensively predict the impact of RBD mutations on both ACE2 and human serum antibody affinity. In addition, our models suggest that mutations of the SARS-CoV-2 virus have led to highly transmissible variants that are strongly linked to increased ACE2 affinity and/or human serum antibody escape. This is evidenced by the fact that many RBD mutations identified using our models have already been identified in connection with increased viral transmissibility, particularly L452Q, L452R, T478K, and E484K. Several concerning variants have at least one of these mutations, including Beta, Gamma, Delta, Lambda, Mu, and Omicron.
We thank Tyler Starr, Jesse Bloom and Allison Greaney for reviewing our manuscript and providing helpful feedback. We thank members of the Tessier lab for their assistance editing the manuscript.
Citation: Makowski EK, Schardt JS, Smith MD, Tessier PM (2022) Mutational analysis of SARS-CoV-2 variants of concern reveals key tradeoffs between receptor affinity and antibody escape. PLoS Comput Biol 18(5): e1010160. https://doi.org/10.1371/journal.pcbi.1010160
Editor: Jinyan Li, University of Technology Sydney, AUSTRALIA
Received: December 2, 2021; Accepted: May 2, 2022; Published: May 31, 2022
Copyright: © 2022 Makowski et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: All data and code are available at https://github.com/Tessier-Lab-UMich/Coronavirus_RBD_mutant_ML.
Funding: This work was supported by the National Institutes of Health (RF1AG059723 and R35GM136300 to P.M.T., 1T32GM140223-01 to E.K.M., and F32GM137513 to J.S.S.), National Science Foundation (CBET 1813963, 1605266 and 1804313 to P.M.T., Graduate Research Fellowship to M.D.S.), and the Albert M. Mattocks Chair (to P.M.T.). P.M.T. and M.D.S. received salary support from the NSF grants, and M.D.S., J.S.S. and P.M.T. received salary support from the NIH grants. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.