Downward bias in the association between APOE and Alzheimer's disease using prevalent and by-proxy disease sampling in the All of Us Research Program.
BACKGROUND: Recent genome-wide association studies of Alzheimer's disease and related dementias (ADRD) have increased statistical power by enlarging biobank-derived analysis datasets through (1) including non-age-matched controls and prevalent cases, and/or (2) including individuals who report a family history of ADRD as proxy cases. However, these methods have the potential to increase noise and distort genetic associations, which are important for genomics-informed prevention and treatment of ADRD. Here, we sought to understand how sensitive the effect sizes of genetic associations in ADRD are to these methodological choices, using APOE genotypes as an example.
METHODS: Participants in the All of Us Research Program over the age of 49 at enrollment (n = 229,722) were assigned to one of four categories: incident ADRD (developed after enrollment in All of Us), prevalent ADRD (present at enrollment), proxy ADRD (participant reported a family history of ADRD), and control (no history or diagnosis of ADRD). ADRD diagnoses were determined using available electronic health records, and APOE genotype was determined using whole-genome sequencing. Effect sizes for the associations between APOE risk alleles and ADRD diagnoses were compared using polychotomous logistic regression and presented as adjusted generalized ratios (AGR).
RESULTS: The mean age of the cohort was 64 ± 9 years, and it was 57% female; 65% clustered predominantly with European genetic reference populations. Among the participants, 733 (0.3%) had prevalent ADRD, 684 (0.3%) had incident ADRD, and 19,186 (8.4%) reported a family history of ADRD (proxy ADRD). For APOE ε4 heterozygotes, the effect size was similar for proxy ADRD (AGR [95% CI]: 2.10 [1.96–2.24]) but attenuated for prevalent ADRD (1.38 [1.17–1.63]) compared to incident ADRD (2.13 [1.81–2.50]). For APOE ε4 homozygotes, the effect sizes were significantly attenuated in both proxy (3.53 [2.93–4.26]) and prevalent (3.12 [2.20–4.45]) ADRD. Furthermore, the effect sizes of the APOE–ADRD association increased when the control (no ADRD) group was restricted to older age brackets.
CONCLUSIONS: Our study highlights how genetic associations with ADRD can be sensitive to how cases are defined in biobanks like All of Us, with effect sizes downwardly biased when using prevalent or by-proxy cases compared to incident cases.
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12920-026-02362-1.
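To make the attenuation reported for APOE ε4 heterozygotes concrete, the sketch below puts the published ratios on the log scale, recovers an approximate standard error from each 95% CI, and compares the proxy and prevalent estimates against the incident estimate. This is an illustrative back-of-the-envelope calculation, not the analysis used in the study: the helper names (`log_scale`, `compare_to_reference`) are hypothetical, and the z statistic assumes the two estimates are independent, which does not strictly hold because all case categories in a polychotomous model share one control group.

```python
import math

# AGR point estimates and 95% CIs for APOE ε4 heterozygotes,
# copied from the RESULTS of this abstract: (ratio, lower, upper).
ESTIMATES = {
    "incident": (2.13, 1.81, 2.50),
    "proxy": (2.10, 1.96, 2.24),
    "prevalent": (1.38, 1.17, 1.63),
}

Z95 = 1.959963984540054  # two-sided 95% normal quantile


def log_scale(ratio, lower, upper):
    """Return (log ratio, approximate standard error) recovered from a 95% CI."""
    se = (math.log(upper) - math.log(lower)) / (2 * Z95)
    return math.log(ratio), se


def compare_to_reference(name, reference="incident"):
    """Return (ratio of ratios, approximate z) for `name` versus `reference`.

    Assumption (not from the source): the two estimates are treated as
    independent, so their variances add. Shared controls violate this,
    so the z value is only a rough guide.
    """
    b, se = log_scale(*ESTIMATES[name])
    b_ref, se_ref = log_scale(*ESTIMATES[reference])
    diff = b - b_ref
    z = diff / math.sqrt(se ** 2 + se_ref ** 2)
    return math.exp(diff), z


for name in ("proxy", "prevalent"):
    ratio_of_ratios, z = compare_to_reference(name)
    print(f"{name}: ratio of ratios = {ratio_of_ratios:.2f}, z = {z:.2f}")
```

A ratio of ratios below 1 indicates that the estimate is smaller than the incident-case estimate; the same helper could be applied to the homozygote estimates once the incident-case value from the full paper is added to the dictionary.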