Robust validation of neuroimaging and clinical models via the SAR method: A case study based on the ADNI dataset.
Alzheimer's disease (AD) is a progressive neurodegenerative disorder characterized by cognitive decline and substantial brain atrophy. Early and accurate prediction of disease progression and staging is crucial for timely intervention and effective treatment planning. Previous studies, including those based on artificial intelligence techniques, have employed neuroimaging, biomarkers and clinical data to model AD progression; however, many of these approaches rely on strong parametric assumptions or lack robust statistical guarantees regarding model validity. To bridge this gap, this study proposes a novel framework for validating predictive and staging models of the disease using a statistically agnostic methodology. The objective is to leverage this unconventional method for the robust validation of machine learning (ML) models related to AD. Validation is performed using the Statistical Agnostic Regression (SAR) methodology applied to the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset. The method tests for a linear relationship by resampling and estimating an upper bound on the expected risk (R) via a Bayesian bound under the worst-case scenario. The power of SAR, that is, the likelihood of detecting a true linear relationship using the test statistic R, is assessed via Monte Carlo simulations under the null distribution. Three predictive models related to structural neuroimaging are assessed: one for the Mini Mental State Examination (MMSE) score, another for the concentration of amyloid beta 1-42 protein in the cerebrospinal fluid, and a third for age. In addition, a staging model based on Alzheimer's-related clinical groups is explored through the joint analysis of segmented gray matter and white matter images. The findings indicate that the SAR methodology not only facilitates robust validation of predictive ML models related to neuroimaging and AD but also enables effective staging of the AD continuum.
The proposed SAR-based framework opens new perspectives for validating ML models for early diagnosis and provides a solid foundation for future research in computational neuroscience.
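
To make the resampling idea summarized above more concrete, the following is a minimal sketch and not the SAR procedure itself. It assumes a plain least-squares line, uses the in-sample mean absolute error as the risk statistic, and swaps in a generic permutation test for the null distribution; the worst-case upper bound on the expected risk is omitted, and power is estimated here by simulating data with a known slope. All function names and parameters (`fit_line`, `empirical_risk`, `permutation_p_value`, `monte_carlo_power`) are hypothetical and chosen only for illustration.

```python
import random


def fit_line(x, y):
    """Ordinary least-squares slope and intercept for paired samples."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def empirical_risk(x, y):
    """In-sample mean absolute error of the fitted line (illustrative risk)."""
    slope, intercept = fit_line(x, y)
    return sum(abs(b - (slope * a + intercept)) for a, b in zip(x, y)) / len(x)


def permutation_p_value(x, y, rng, n_perm=200):
    """Share of label permutations whose risk is at most the observed risk.

    Permuting y breaks any x-y dependence, so the permuted risks
    approximate the distribution of the statistic under the null.
    """
    observed = empirical_risk(x, y)
    y_perm = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(y_perm)
        if empirical_risk(x, y_perm) <= observed:
            hits += 1
    return (1 + hits) / (1 + n_perm)


def monte_carlo_power(slope, n=50, n_sim=100, alpha=0.05, seed=0):
    """Monte Carlo estimate of the rejection rate for a given true slope.

    Assumption: data follow y = slope * x + standard Gaussian noise.
    """
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sim):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        y = [slope * a + rng.gauss(0.0, 1.0) for a in x]
        if permutation_p_value(x, y, rng) <= alpha:
            rejections += 1
    return rejections / n_sim
```

Under these assumptions, `monte_carlo_power(0.5)` returns the estimated fraction of simulated datasets in which a linear relationship of slope 0.5 is detected at the 5% level, while `monte_carlo_power(0.0)` gives the estimated false-positive rate, which should stay close to `alpha`.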