Fig. 3: Identification of rare noncoding variants associated with AD risk.

a Overview of rare noncoding variant analysis using CWAS. b Variants were annotated with 59 terms across five groups, generating 29,917 non-redundant categories. c Volcano plot showing burden enrichment across categories; intergenic categories (red) showed significant enrichment (RR > 1, p < 0.05). A Bonferroni-corrected significance threshold was applied based on 1463 effective tests. d Density plots show the number of significant tests for each phenotype within each noncoding category. The red lines represent nominally significant Aβ-positive-enriched categories. The permuted expected distribution is shown in gray. Significance was tested by permutation (n = 1000). e Network of AD-associated intergenic category clusters. Each node represents a cluster of categories. Nodes are connected based on correlations of disease association, represented as normalized z-scores. f The correlation between single annotations and four risk clusters (FDR < 0.05, RR > 1). The four clusters were grouped into three terms (correlation > 0.5). Numbers of variants are indicated in parentheses. g Schematic of variant selection within the risk cluster. h Comparison of K-MMSE scores between individuals carrying and not carrying the C25 variant within the Aβ-positive group. n represents the number of samples in each group (C25 carriers: n = 508; non-carriers: n = 400). Box plots show the median line. Box edges mark the 25th and 75th percentiles. Whiskers span 1.5 × the interquartile range. Points beyond are outliers. Two-sided linear regression was used with adjustment for sample covariates. i Example of a C25 variant interacting with multiple genes through Hi-C and overlapping excitatory neuron-specific regulatory elements. j Heatmap of genes linked to risk variants across AD phenotypes, with color gradients indicating log2-fold changes and significance marked by *. Only gene–variant pairs from the excitatory neuron subtype (Exc L2–3 CBLN2 LINC02306) are shown.
Differential expression was tested via quasi-likelihood F-test (muscat), FDR adjusted. Source data are provided as a Source Data file. AD, Alzheimer’s disease; CWAS, category-wide association study; FDR, false discovery rate; RR, relative risk; C25, Cluster 25; H3K122ac, acetylation of lysine 122 on histone H3; APP catabolism, regulation of amyloid precursor protein catabolic process (GO:1902991).
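
The two significance procedures named in panels c and d can be illustrated with a minimal Python sketch. It is not the authors' pipeline. The family-wise alpha of 0.05, the one-sided direction of the permutation test, the add-one correction, and both function names are assumptions made here for illustration; only the 1463 effective tests and the 1000 permutations come from the caption.

```python
def bonferroni_threshold(alpha: float, n_effective_tests: int) -> float:
    """Per-test p-value cutoff that keeps the family-wise error rate at alpha."""
    return alpha / n_effective_tests


def permutation_p_value(observed: float, permuted: list[float]) -> float:
    """One-sided empirical p-value with an add-one correction (assumed here).

    Counts the permuted statistics that are at least as large as the observed
    one; the +1 in numerator and denominator avoids reporting p = 0.
    """
    n_as_extreme = sum(1 for value in permuted if value >= observed)
    return (n_as_extreme + 1) / (len(permuted) + 1)


# alpha = 0.05 is an assumption; 1463 effective tests is taken from the caption.
threshold = bonferroni_threshold(0.05, 1463)
print(f"Bonferroni threshold: {threshold:.3g}")

# Toy values: with 1000 permutations and none reaching the observed count,
# the smallest attainable p-value under this convention is 1/1001.
toy_permuted = [0.0] * 1000
print(permutation_p_value(3.0, toy_permuted))
```

Under these assumptions, a category would pass the corrected threshold in panel c only if its burden p-value falls below roughly 3.4 × 10⁻⁵.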