Paola Benaglio, Jacklyn Newsome, Jee Yun Han, Joshua Chiou, Anthony Aylward, Sierra Corban, Michael Miller, Mei-Lin Okino, Jaspreet Kaur, Sebastian Preissl, David U. Gorkin, Kyle J. Gaulton
Gene regulation is highly cell type-specific, and understanding the function of non-coding genetic variants associated with complex traits requires molecular phenotyping at cell type resolution. In this study, we performed single nucleus ATAC-seq (snATAC-seq) and genotyping in peripheral blood mononuclear cells from 13 individuals. Clustering chromatin accessibility profiles of 96,002 total nuclei identified 17 immune cell types and sub-types. We mapped chromatin accessibility QTLs (caQTLs) in each immune cell type and sub-type using individuals of European ancestry, which identified 6,901 caQTLs at FDR < 0.10 and 4,220 caQTLs at FDR < 0.05, including caQTLs obscured in assays of bulk tissue, such as those with divergent effects on different cell types. For 3,941 caQTLs, we further annotated putative target genes of variant activity using single cell co-accessibility, and caQTL variants were significantly correlated with the accessibility level of linked gene promoters.
Genome-wide association studies have identified thousands of genomic loci associated with complex human traits and disease [1–3], but their molecular mechanisms remain largely unknown. Interpreting the mechanisms of trait-associated loci is paramount to an improved understanding of the cell types, genes and pathways involved in complex traits and disease. Genetic variants at complex trait-associated loci are primarily non-coding and enriched in transcriptional regulatory elements [1,4,5], implying that the majority affect gene regulatory programs. As gene regulation is highly cell type-specific [6,7], uncovering the molecular mechanisms of complex trait loci requires determining the function of non-coding variants in the individual cell types that comprise a tissue. While substantial advances have been made in annotating the non-coding genome [5,8], the regulatory effects of genetic variants in specific cell types are still largely unknown.
Materials and methods
The studies were approved by the Institutional Review Board (IRB) of the University of California San Diego. The human donors used were anonymous to the authors of this study.
Single nuclei ATAC-seq
Peripheral blood mononuclear cells (PBMCs) from 13 individuals (7 females and 6 males) were purchased from HemaCare (Northridge, CA) and profiled for snATAC-seq using the 10x Genomics Chromium Single Cell ATAC Solution, following the manufacturer’s instructions (Chromium Single Cell ATAC Reagent Kits User Guide CG000209, Rev A) as described previously. Briefly, cryopreserved PBMC samples were thawed, resuspended in 1 mL PBS (with 0.04% FBS) and filtered with 50 μm CellTrics. Cells were centrifuged and permeabilized with 100 μL of chilled lysis buffer (10 mM Tris-HCl pH 7.4, 10 mM NaCl, 3 mM MgCl2, 0.1% Tween-20, 0.1% IGEPAL-CA630, 0.01% digitonin and 1% BSA) for 3 min on ice and then washed with 1 mL of chilled wash buffer (10 mM Tris-HCl pH 7.4, 10 mM NaCl, 3 mM MgCl2, 0.1% Tween-20 and 1% BSA). After centrifugation, pellets were resuspended in 100 μL of chilled Nuclei buffer (2000153, 10x Genomics) at a final concentration of 3,000 to 7,000 nuclei per μL. For each sample, 15,300 nuclei (targeting 10,000) were used.
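As a worked illustration of the loading arithmetic implied above, the following minimal Python sketch computes the suspension volume that would contain 15,300 nuclei at the two ends of the stated concentration range. The function name `loading_volume_ul` is hypothetical and introduced only for this example; the calculation is plain division and is not taken from the manufacturer's protocol.

```python
def loading_volume_ul(nuclei_needed: int, nuclei_per_ul: float) -> float:
    """Volume (in microliters) of suspension containing `nuclei_needed` nuclei.

    Hypothetical helper for illustration; assumes a well-mixed suspension.
    """
    if nuclei_per_ul <= 0:
        raise ValueError("concentration must be positive")
    return nuclei_needed / nuclei_per_ul


NUCLEI_LOADED = 15_300  # nuclei used per sample, as stated in the text

# Evaluate both ends of the stated 3,000 to 7,000 nuclei per microliter range.
for concentration in (3_000, 7_000):
    volume = loading_volume_ul(NUCLEI_LOADED, concentration)
    print(f"{concentration} nuclei/uL -> {volume:.2f} uL")
```

Under these assumptions, the required volume falls between roughly 2 and 5 μL across the stated concentration range.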
Chromatin accessibility profiling of peripheral blood mononuclear cells
We performed snATAC-seq and genotyping of human peripheral blood mononuclear cell (PBMC) samples to map genetic effects on lymphoid and myeloid cell type accessible chromatin (Fig 1A). We used droplet-based snATAC-seq (10x Genomics) to assay 13 PBMC samples from individuals of self-reported European descent (S1 Table, see Methods). The snATAC-seq libraries were sequenced to an average depth of 212M read pairs (24,227 read pairs per nucleus on average), and libraries had consistently high-quality metrics, including enrichment at transcription start sites (TSS) and fraction of reads mapping in peaks (S2 Table). We then performed array genotyping of each sample and imputed genotypes into 308M variants in the TOPMed r2 reference panel.
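To make one of the quality metrics named above concrete, the sketch below computes the fraction of reads mapping in peaks on toy data. It is an illustration only and not the pipeline used in this study: the function name `fraction_in_peaks`, the half-open `(start, end)` interval representation, the restriction to a single chromosome, and the requirement that peaks do not overlap one another are all assumptions made for this example.

```python
from bisect import bisect_right


def fraction_in_peaks(reads, peaks):
    """Fraction of reads overlapping at least one peak.

    Both arguments are lists of half-open (start, end) intervals on one
    chromosome. Peaks are assumed not to overlap each other.
    """
    if not reads:
        return 0.0
    peaks = sorted(peaks)
    starts = [start for start, _ in peaks]
    ends = [end for _, end in peaks]
    hits = 0
    for read_start, read_end in reads:
        # Last peak that starts before the read ends.
        i = bisect_right(starts, read_end - 1) - 1
        # With non-overlapping peaks, only this peak can overlap the read.
        if i >= 0 and ends[i] > read_start:
            hits += 1
    return hits / len(reads)


# Toy data (hypothetical coordinates, not from the study).
toy_peaks = [(100, 200), (500, 600)]
toy_reads = [(90, 110), (150, 160), (200, 250), (450, 500), (590, 640)]
frip = fraction_in_peaks(toy_reads, toy_peaks)
print(f"Fraction of reads in peaks: {frip:.2f}")
```

In practice this metric is computed from aligned fragment files across all chromosomes; the binary search here simply keeps the toy version efficient for many reads.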
In this study we demonstrated that profiles derived from single nucleus ATAC-seq assays of a heterogeneous tissue can be used to map chromatin accessibility QTLs in individual cell types and sub-types. While only a small number of samples were profiled in our study, we identified thousands of immune cell type and sub-type caQTLs. One reason for the larger number of caQTLs identified is that we performed QTL mapping at the level of each cell type and sub-type, which revealed more caQTLs compared to treating each sample as a ‘bulk’ experiment. Another reason for the larger number of caQTLs identified was the high depth of sequencing per sample, which provided greater power particularly for allelic imbalance mapping. Supporting this, we identified substantially fewer caQTLs when down-sampling our sequence data as well as when performing population-based QTL mapping without the allelic imbalance component. As the number of unique reads covering a variant can in theory be much higher for snATAC-seq compared to bulk ATAC-seq due to having thousands of libraries per assay, the value of snATAC-seq in mapping allelic imbalance is even more pronounced.
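The relationship between read depth and power for allelic imbalance can be illustrated with a small sketch. The example below uses an exact two-sided binomial test against an expected 50:50 allelic ratio at a heterozygous variant; this is a deliberately simple stand-in chosen for illustration and is not necessarily the statistical model used in this study. The function name `binomial_two_sided_p` and the read counts are hypothetical.

```python
from math import comb


def binomial_two_sided_p(ref_reads: int, alt_reads: int, p_null: float = 0.5) -> float:
    """Exact two-sided binomial p-value for allelic imbalance.

    Tests whether the reference-allele read count deviates from the
    proportion `p_null` expected in the absence of imbalance.
    """
    n = ref_reads + alt_reads
    pmf = [comb(n, k) * p_null**k * (1 - p_null) ** (n - k) for k in range(n + 1)]
    observed = pmf[ref_reads]
    # Sum the probabilities of all outcomes no more likely than the observed one.
    total = sum(p for p in pmf if p <= observed * (1 + 1e-9))
    return min(1.0, total)


# Same 60:40 allelic ratio at two hypothetical read depths.
p_shallow = binomial_two_sided_p(30, 20)    # 50 reads covering the variant
p_deep = binomial_two_sided_p(300, 200)     # 500 reads covering the variant
print(f"50 reads:  p = {p_shallow:.3g}")
print(f"500 reads: p = {p_deep:.3g}")
```

With the same underlying allelic ratio, the deeper coverage yields a much smaller p-value, which is the intuition behind the gain in power from more unique reads per variant.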
Citation: Benaglio P, Newsome J, Han JY, Chiou J, Aylward A, Corban S, et al. (2023) Mapping genetic effects on cell type-specific chromatin accessibility and annotating complex immune trait variants using single nucleus ATAC-seq in peripheral blood. PLoS Genet 19(6): e1010759. https://doi.org/10.1371/journal.pgen.1010759
Editor: Anne O’Donnell-Luria, Broad Institute, UNITED STATES
Received: August 17, 2022; Accepted: April 25, 2023; Published: June 8, 2023
Copyright: © 2023 Benaglio et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: Raw sequence data are available in NCBI GEO accession GSE199253 and GSE163160 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE199253; https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE163160). Genotyping data can be accessed upon request through the European Genome-Phenome Archive (EGA) at ID EGAS00001006184 (https://ega-archive.org/datasets/EGAD00010002308). All processed data underlying analyses and figures in the paper are located at https://doi.org/10.5281/zenodo.7375095 and in the supplementary tables where specified. Pipelines and custom code for all analyses and graphs of this manuscript are available in the Github page https://github.com/Gaulton-Lab/pbmc_snATAC.
Funding: This work was supported by National Institute of Diabetes and Digestive and Kidney Diseases awards DK112155, DK120429 and DK122607 to K.J.G. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: I have read the journal’s policy and the authors of this manuscript have the following competing interests: Dr. Gaulton has done consulting for Genentech and holds stock in Neurocrine Biosciences. Dr. Benaglio is an employee of Shoreline Bioscience. Dr. Chiou is an employee and shareholder of Pfizer. These affiliations have no competing interest related to the submitted work. The other authors have no competing interests to disclose.