Combinations of Genes at the 16p11.2 and 22q11.2 CNVs Contribute to Neurobehavioral Traits

Mikhail Vysotskiy, Autism Working Group of the Psychiatric Genomics Consortium, Bipolar Disorder Working Group of the Psychiatric Genomics Consortium, Schizophrenia Working Group of the Psychiatric Genomics Consortium, Lauren A. Weiss

Abstract

The 16p11.2 and 22q11.2 copy number variants (CNVs) are associated with neurobehavioral traits including autism spectrum disorder (ASD), schizophrenia, bipolar disorder, obesity, and intellectual disability. Identifying specific genes contributing to each disorder and dissecting the architecture of CNV-trait association has been difficult, inspiring hypotheses of more complex models, such as multiple genes acting together. Using multi-tissue data from the GTEx consortium, we generated pairwise expression imputation models for CNV genes and then applied these elastic net models to GWAS of five traits: ASD, bipolar disorder, schizophrenia, BMI (obesity), and IQ (intellectual disability). We compared the variance in these five traits explained by gene pairs with the variance explained by single genes and by traditional interaction models. We also modeled polygenic region-wide effects by summing predicted expression ranks across many genes to create a region-wide score.

Introduction

Copy number variants (CNVs) at 16p11.2 and 22q11.2 contribute to neurobehavioral disorders including autism spectrum disorder (ASD), schizophrenia, bipolar disorder, intellectual disability, and obesity [1–11]. Specific gene-trait contributions at these regions have proven difficult to identify. Single-gene fine-mapping approaches have been challenging because of a lack of highly penetrant point mutations in these genes and inconsistent findings in animal models [12–15]. A potential reason for the lack of clear gene-phenotype relationships is that the architecture may be more complicated than single-gene contributions to each trait [16]. More complex models are good candidates for in silico analysis, as multiple hypotheses can be efficiently assessed in parallel.

Materials and methods

Genes studied

We selected genes at the 16p11.2 and 22q11.2 CNV regions that fell into one of these annotation categories: protein-coding, lincRNA, pseudogene, antisense, miRNA. These categories are consistent with those used previously for PrediXcan modeling, with miRNA included given the strong representation of miRNAs at 22q11.2 [41, 42]. We included noncoding genes, as they have not received significant attention in studies of these regions, despite some evidence of miRNA contribution to 22q11.2 phenotypes. In addition, we considered flanking genes within 200 kb of the region, as there is suggestive evidence of broader transcriptional effects in CNV carriers, and because we previously found evidence of flanking gene involvement in psychosis [22, 27]. S1 and S2 Tables list the single genes and gene pairs used in the analysis.
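The selection rule described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the `Gene` record, the `select_genes` helper, the biotype label strings, the choice to treat "within 200 kb" as overlap with the flank-extended interval, and all coordinates in the toy annotation are assumptions made for illustration.

```python
from dataclasses import dataclass

# Annotation categories named in the text; the exact label strings are assumed.
KEPT_BIOTYPES = {"protein_coding", "lincRNA", "pseudogene", "antisense", "miRNA"}
FLANK_BP = 200_000  # 200 kb flank on either side of the CNV region


@dataclass(frozen=True)
class Gene:
    name: str
    chrom: str
    start: int  # 1-based, inclusive
    end: int
    biotype: str


def select_genes(genes, chrom, region_start, region_end, flank=FLANK_BP):
    """Return genes of a kept biotype that overlap the region plus its flanks."""
    lo = max(1, region_start - flank)
    hi = region_end + flank
    return [
        g for g in genes
        if g.chrom == chrom
        and g.biotype in KEPT_BIOTYPES
        and g.start <= hi and g.end >= lo
    ]


# Toy annotation with made-up names and coordinates (hypothetical).
toy = [
    Gene("GENE_A", "chr16", 1_000_000, 1_010_000, "protein_coding"),  # inside region
    Gene("GENE_B", "chr16", 850_000, 860_000, "lincRNA"),             # inside flank
    Gene("GENE_C", "chr16", 500_000, 510_000, "protein_coding"),      # beyond flank
    Gene("GENE_D", "chr16", 1_020_000, 1_030_000, "snoRNA"),          # excluded biotype
]
selected = select_genes(toy, "chr16", 1_000_000, 1_500_000)
print([g.name for g in selected])
```

In this toy case only the gene inside the region and the gene inside the flank are retained; the distant gene and the gene with an unlisted biotype are dropped.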

Results

We predicted the expression of individual CNV genes (using publicly available elastic net models) and pairs of CNV genes (using elastic net models trained on GWAS SNPs) across GTEx tissues. We also selected matched control regions for comparison with the CNV region. First, we identified significant genes and gene pairs through association analysis with five traits (using the control region genes as a null distribution to test for significance). Next, we compared the trait variance explained by single gene models vs pairwise models, as well as the specific genes with top associations in single gene vs pairwise models. Finally, we used a rank scoring approach to create region-wide scores to test for a polygenic contribution of CNV genes across the region. Fig 1b summarizes this analysis design.
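Two steps of this design, the rank-based region-wide score and the use of control regions as a null distribution, can be sketched as follows. This is one plausible reading rather than the procedure used in the study: the function names `ranks`, `region_score`, and `empirical_p`, the use of average ranks for ties, the two-sided comparison, the add-one correction, and all numbers in the toy data are assumptions.

```python
def ranks(values):
    """Average ranks (1-based); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = mean_rank
        i = j + 1
    return out


def region_score(pred_expr):
    """Sum within-gene ranks of predicted expression for each individual.

    pred_expr maps gene name -> list of predicted expression values,
    one per individual, in the same individual order for every gene.
    """
    n = len(next(iter(pred_expr.values())))
    total = [0.0] * n
    for values in pred_expr.values():
        for i, r in enumerate(ranks(values)):
            total[i] += r
    return total


def empirical_p(observed, null_stats):
    """Share of null statistics at least as extreme as the observed one.

    Uses an add-one correction so the estimate is never exactly zero.
    """
    hits = sum(1 for s in null_stats if abs(s) >= abs(observed))
    return (hits + 1) / (len(null_stats) + 1)


# Toy predicted expression for three individuals (made-up values).
pred = {"GENE_A": [0.1, 0.5, 0.3], "GENE_B": [1.0, 2.0, 2.0]}
scores = region_score(pred)
print(scores)

# Toy association statistics from matched control regions (made-up values).
p = empirical_p(3.0, [0.5, -1.2, 3.5, 2.0])
print(p)
```

The resulting per-individual score could then be tested for association with a trait in the same way as a single predicted expression value, with the statistic compared against those obtained from control-region genes.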

Discussion

Our study aimed to provide insight into the genetic architecture of the 16p11.2 and 22q11.2 copy number variants. We modeled the neurobehavioral trait consequences of pairs of genes expressed in the same direction, extending our previous single-gene analysis (Fig 4). Both 16p11.2 and 22q11.2 had pairs of genes associated with all tested phenotypes at a permutation-based threshold. Despite a larger number of genes tested in 22q11.2, the count of associated genes was larger for 16p11.2 gene pairs. We found that for nearly all traits tested, variance in phenotype was better explained by pairs of genes than by single genes or traditional interaction models. The only exception was bipolar disorder at 22q11.2, where single genes explained more variance. However, for schizophrenia, BMI, and IQ at 22q11.2, the pairwise model was not specific to the CNV regions but appeared to be a trait-based property of genetic architecture extending to matched control regions. These findings suggest that the pairwise effects differ between regions.

Acknowledgments

We acknowledge Nancy J. Cox for her contribution to study conception and advice about methodology. Noah Zaitlen provided helpful ideas for testing the utility of pairwise models. This research has been conducted using the UK Biobank Resource under Application Number 47982.

Citation: Vysotskiy M, Autism Working Group of the Psychiatric Genomics Consortium, Bipolar Disorder Working Group of the Psychiatric Genomics Consortium, Schizophrenia Working Group of the Psychiatric Genomics Consortium, Weiss LA (2023) Combinations of genes at the 16p11.2 and 22q11.2 CNVs contribute to neurobehavioral traits. PLoS Genet 19(6): e1010780. https://doi.org/10.1371/journal.pgen.1010780

Editor: Santhosh Girirajan, Pennsylvania State University, UNITED STATES

Received: October 3, 2022; Accepted: May 9, 2023; Published: June 2, 2023

Copyright: © 2023 Vysotskiy et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: Individual-level genotypes for Psychiatric Genomics Consortium cohorts can be obtained by applying at https://pgc.unc.edu/for-researchers/data-access-committee/data-access-information/. Summary-level data from the PGC are available at https://pgc.unc.edu/for-researchers/download-results/. Summary-level genetic datasets for BMI and IQ are freely available to download from GIANT BMI (https://portals.broadinstitute.org/collaboration/giant/index.php/GIANT_consortium) and CNCR IQ (https://ctg.cncr.nl/software/summary_statistics). Individual-level UK Biobank data can be obtained by application at https://www.ukbiobank.ac.uk/enable-your-research/apply-for-access. PrediXcan single-gene genome-wide models are available to download at predictdb.org. GTEx genotypes and phenotypes can be requested through dbGaP (https://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs000424.v8.p2). Summary statistics from association studies performed in this article are provided in S5, S6 and S7 Tables.

Funding: This work was supported by National Institute of Mental Health R01 MH107467 to LAW. The funding body had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing interests: The authors have declared that no competing interests exist.