NeuroBACE-ML: A reliability-aware screening framework for high-throughput prioritization of potent BACE1 inhibitors.
Beta-site amyloid precursor protein cleaving enzyme 1 (BACE1) is a key enzyme in amyloid-β generation and remains an important target in Alzheimer's disease (AD) drug discovery. Here, we present NeuroBACE-ML, a reliability-aware screening framework for high-throughput prioritization of potent BACE1 inhibitors from small-molecule libraries. Human BACE1 bioactivity records were curated from ChEMBL, standardized to the pIC50 scale, and labeled with a strict binary definition to reduce label ambiguity: active (IC50 ≤ 100 nM; pIC50 ≥ 7) or inactive (IC50 ≥ 1 μM; pIC50 ≤ 6), with the intermediate grey zone (100 nM < IC50 < 1 μM; 6 < pIC50 < 7) excluded. Molecules were represented as Morgan fingerprints, and the primary classifier was built with XGBoost using Optuna-based hyperparameter optimization. On a fixed, randomly selected held-out test set, NeuroBACE-ML showed high discriminative performance, with AUROC = 0.986, AUPRC = 0.991, MCC = 0.868 and balanced accuracy = 0.943 at the deployed operating threshold of 0.70 (TN = 468, FP = 14, FN = 71, TP = 763). To strengthen reliability for prospective screening, the framework incorporates probability calibration, scaffold-aware robustness assessment, applicability-domain-aware decision support, abstention logic and ensemble uncertainty analysis. In addition, external validation on an independent, non-overlapping BindingDB dataset supported generalizability beyond the ChEMBL-derived benchmark (AUROC = 0.969, AUPRC = 0.987, MCC = 0.790). Although the framework is intended for early-stage candidate prioritization rather than direct clinical translation, it provides a practical and deployable tool for identifying high-confidence BACE1 inhibitor candidates for downstream medicinal chemistry and experimental follow-up. One limitation is that excluding intermediate-potency compounds simplifies the borderline activity patterns encountered in practical screening and may therefore limit real-world applicability.
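As an illustration of the labeling rule and the threshold-dependent metrics quoted above, the following minimal sketch applies the stated IC50 cut-offs and recomputes MCC and balanced accuracy from the reported confusion matrix. The function names are hypothetical and are not taken from the NeuroBACE-ML code base; only the cut-offs and the four counts come from the text.

```python
import math


def pic50_from_ic50_nm(ic50_nm: float) -> float:
    """Convert an IC50 in nanomolar to pIC50 = -log10(IC50 in molar)."""
    if ic50_nm <= 0:
        raise ValueError("IC50 must be positive")
    return 9.0 - math.log10(ic50_nm)


def label_from_ic50_nm(ic50_nm: float):
    """Return 1 (active), 0 (inactive) or None (grey zone, excluded).

    Cut-offs follow the abstract: active if IC50 <= 100 nM,
    inactive if IC50 >= 1000 nM, otherwise excluded.
    """
    if ic50_nm <= 100.0:
        return 1
    if ic50_nm >= 1000.0:
        return 0
    return None


def metrics_from_confusion(tn: int, fp: int, fn: int, tp: int) -> dict:
    """Compute sensitivity, specificity, balanced accuracy and MCC."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    balanced_accuracy = (sensitivity + specificity) / 2.0
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denominator if denominator else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "balanced_accuracy": balanced_accuracy,
        "mcc": mcc,
    }


if __name__ == "__main__":
    # Illustrative IC50 values (nM), not records from the curated dataset.
    for ic50 in (25.0, 100.0, 300.0, 1000.0, 5000.0):
        print(ic50, round(pic50_from_ic50_nm(ic50), 2), label_from_ic50_nm(ic50))

    # Counts reported for the held-out test set at threshold 0.70.
    reported = metrics_from_confusion(tn=468, fp=14, fn=71, tp=763)
    print({name: round(value, 3) for name, value in reported.items()})
```

With the reported counts, this recomputation gives MCC ≈ 0.868 and balanced accuracy ≈ 0.943, matching the values stated for the 0.70 operating threshold. AUROC and AUPRC cannot be recovered this way because they depend on the full ranking of predicted probabilities rather than on a single threshold.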
NeuroBACE-ML is available as a web application at https://neurobace-ml.streamlit.app/, with supporting code and deployment resources available on GitHub at https://github.com/kunal74/NeuroBACE-ML.