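A Breakthrough for Plasma Proteomic Signatures in Alzheimer's Disease Diagnosis: Accuracy Comparable to ATN Biomarkers and Cross-Platform Validation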
Annals of Clinical and Translational Neurology: Plasma Proteomic Signatures for Alzheimer's Disease: Comparable Accuracy to ATN Biomarkers and Cross-Platform Validation
Date: October 16, 2025
Source: Annals of Clinical and Translational Neurology 3.9
Editor's recommendation:
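This study developed a diagnostic model for Alzheimer's disease (AD) based on 11 plasma proteins (including ApoE and BNP) and covariates (age, sex, APOE ε4). The model reached accuracies of 93.5% and 95.2% in the ADNI and Ace cohorts, respectively, showing diagnostic performance comparable to ATN (amyloid-β/tau/neurodegeneration) biomarkers. It was also validated across platforms (Luminex xMAP and SOMAscan), providing important evidence for research on, and clinical application of, blood-based biomarkers (BBMs) for AD.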
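As background, plasma proteomics shows promise for AD risk assessment and disease characterization, but differences between platforms create uncertainty about cross-platform applicability. The study aimed to identify a detailed plasma signature for distinguishing AD from cognitively normal (CN) individuals, and a second signature for classifying mild cognitive impairment (MCI) decliners versus non-decliners, and to explore the applicability of these models across two proteomics platforms.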
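The study used 190 plasma analytes from 566 participants in the Alzheimer's Disease Neuroimaging Initiative (ADNI), measured on the Luminex xMAP platform, and applied elastic net to model MCI stable/decliner and AD/CN classification. MCI decliners were defined as individuals who progressed to AD during follow-up (mean 4.2 ± 3.2 years). External cross-platform validation used 1303 participants from the Spanish Ace study, measured on the SOMAscan 7k platform.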
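The 11-analyte signature for distinguishing AD from CN reached 93.5% accuracy in ADNI and 95.2% in Ace. ApoE and BNP were the two most important contributors to the classifier. The MCI classification signature performed worse, with 65.9% accuracy in ADNI and 51.0% in the Ace validation test.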
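Compared with earlier proteomic studies based on the same dataset, our findings achieved higher specificity and sensitivity for AD classification while using a smaller analyte panel. We also confirmed the reliability and consistency of the signature in different populations measured on different platforms. However, the plasma proteomic platforms explored were not sufficient to distinguish MCI decliners from non-decliners.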
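Alzheimer's disease (AD) is a progressive neurodegenerative disorder characterized by slowly advancing cognitive and functional decline, which gradually leads to loss of the ability to perform even the simplest tasks of daily life. Mild cognitive impairment (MCI) due to AD is the first clinically detectable stage of the disease. MCI is usually preceded by a long preclinical period of progressive accumulation of the AD pathological AT[N] (ATN) triad: extracellular amyloid-β (Aβ) plaques, intracellular tau-enriched neurofibrillary tangles (NFT), and features of neurodegeneration. Clinical MCI symptoms become noticeable once these pathologies reach abnormally high levels.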
The MCI stage may last several years, and some patients may fluctuate, revert, or remain stable for a period, although they remain at high risk of progressing to dementia. Although MCI patients have a higher AD risk than cognitively normal older adults, many studies show that some individuals diagnosed with MCI may revert to normal cognition, sometimes following changes in lifestyle, better management of comorbidities, or even spontaneously. Thus, biomarker-based early and accurate diagnosis of MCI, along with the ability to predict progression toward further cognitive decline versus relative stability or reversal, may help enhance effective personalized management of patients and their treatment outcomes, and may help relieve pressure on healthcare systems.
The emphasis on early or presymptomatic diagnosis is further supported by results from recent randomized controlled trials (RCTs) of two anti-amyloid immunotherapies, which demonstrated maximal benefits in early MCI patients; trials are also underway to assess these therapies in presymptomatic individuals at high risk for AD. It is widely acknowledged that an ideal biomarker should not only be stable in functionality, specificity, sensitivity, and accuracy but also be simple, preferably allowing direct measurement in easily accessible standard biological sources such as plasma, serum, saliva, or urine.
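The Alzheimer's Disease Neuroimaging Initiative (ADNI) has been running since 2004, with the overall goal of gaining a deeper understanding of the biological mechanisms of MCI and AD. Over nearly two decades, ADNI has made public extensive clinical, cognitive, and genetic data collected in five phases (ADNI1, ADNI2, ADNI3, ADNI4, and ADNI-GO). Previous studies based on ADNI and other cohorts have identified plasma protein-based signatures, measured with the Luminex xMAP assay, that can distinguish AD, cognitively unimpaired or normal individuals (CN), and MCI. Recent advances in AD blood-based biomarkers (BBMs), such as the use of p-tau217 and Aβ42/Aβ40 for screening in clinical trials, point to the promise of BBMs in clinical settings. Proteomic signatures, alone or combined with these BBMs, may enhance the diagnostic yield of clinical and pathological diagnosis and may help distinguish potential AD biotypes.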
Because of rapid advances in biotechnology, new biomarker assays continue to appear at an accelerating pace. As an emerging high-throughput, large-scale protein quantification method, SOMAscan is gaining popularity, but it has primarily been evaluated in comparison with platforms like Olink, with limited attention to its transferability in relation to some broadly used platforms, such as Luminex xMAP technology. Given that multiplex immunoassays have long been used in proteomics studies to measure biomolecular interactions on a high-throughput, high-content platform, further investigation into the transferability and limitations of SOMAscan and other emerging assays, compared with traditional multiplex immunoassays, would aid in coherently understanding proteomic patterns in dementia studies. Cross-platform correlation of protein measurements remains unclear for some technologies and remains a challenge for others. While most studies assessing cross-platform transferability focus on evaluating protein measurement correlations, few have directly validated transferability by building predictive models on data from one platform and testing them on another.
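In this study, we aimed to develop an approach for selecting and using small protein panels beyond core AD pathology to increase the diagnostic specificity and sensitivity for AD, and a second panel for distinguishing MCI patients who progress to the dementia stage from those who remain stable or revert, using the targeted multiplex panels measured in ADNI. In addition, we aimed to verify the robustness of our models and signatures by testing both models on a different platform and population, using SOMAscan data from an independent sample, the Ace CSF cohort.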
The proteomic data used in this study are the "Biomarkers Consortium Plasma Proteomics Project RBM Multiplex Data" dataset, derived from the Biomarkers Consortium "Use of Targeted Multiplex Proteomic Strategies to Identify Plasma-Based Biomarkers in Alzheimer's Disease" project, a subset of ADNI. This dataset encompasses 190 plasma analytes, collected from 566 participants at baseline and 12 months later, chosen for their relevance to diseases such as cancer, cardiovascular disease, metabolic disorders, and inflammation. Details are available in the 'Biomarkers Consortium ADNI Plasma Targeted Proteomics Project-Data Primer' file. Demographic data and baseline clinical assessments were obtained from the ADNI database.
Characteristics of all study populations were compared using one-way analysis of variance (ANOVA).
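The 566 subjects in this study were classified by ADNI at baseline, following its diagnostic protocol, as CN (n=58), prevalent MCI (n=396), or prevalent AD (n=112). According to the ADNI diagnostic summary, CN subjects had no memory complaints, whereas both MCI and AD participants were required to have such complaints. MMSE scores ranged from 24 to 30 for CN and MCI subjects and from 20 to 26 for AD patients. The CDR score was set at 0 for CN individuals and at 0.5 for MCI participants, with a required memory box score of at least 0.5. AD subjects scored between 0.5 and 1. For memory assessment, delayed recall of one paragraph from the Logical Memory II subscale of the Wechsler Memory Scale–Revised was used (maximum score: 25). Cutoffs for normal subjects varied by education: 9 or above for 16 or more years of education, 5 or above for 8-15 years, and 3 or above for 0-7 years. For MCI and AD participants, the cutoffs were 8 or below for 16 or more years of education, 4 or below for 8-15 years, and 2 or below for 0-7 years.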
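The education-adjusted memory cutoffs described above can be written as a small lookup. The sketch below is only an illustration of the stated thresholds; the function name and the group labels are hypothetical and are not taken from ADNI materials, and the other diagnostic criteria (memory complaints, MMSE, CDR) are not modeled.

```python
def logical_memory_cutoff_met(score, education_years, group):
    """Check a Logical Memory II delayed-recall score against the stated cutoffs.

    group is "CN", "MCI", or "AD" (hypothetical labels for this sketch).
    """
    if not 0 <= score <= 25:
        raise ValueError("score must be between 0 and 25")
    # Minimum score counted as normal, by years of education.
    if education_years >= 16:
        normal_min = 9
    elif education_years >= 8:
        normal_min = 5
    else:
        normal_min = 3
    if group == "CN":
        return score >= normal_min
    if group in ("MCI", "AD"):
        # Impaired cutoffs sit one point below the normal minimum (8, 4, 2).
        return score <= normal_min - 1
    raise ValueError("group must be 'CN', 'MCI', or 'AD'")
```

For example, a score of 4 with 12 years of education meets the MCI/AD memory cutoff but not the CN one.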
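Based on follow-up of up to 198 months (mean 4.19 ± 3.21 years), we classified MCI participants as MCI reverters (reverted to CN and remained so until the end of follow-up, N=21), MCI stable (consistently diagnosed as MCI, or occasionally as CN or AD but back to MCI before follow-up ended, N=158), and MCI decliners (declined to AD dementia and remained so until follow-up ended, N=217).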
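One simplified reading of these definitions is that the subgroup is determined by the diagnosis at the end of follow-up, since intermediate fluctuations are allowed in each category. The sketch below encodes that reading; it is an assumption made for illustration, and the function name and string labels are hypothetical.

```python
def mci_subgroup(followup_diagnoses):
    """Label a baseline-MCI participant from time-ordered follow-up diagnoses.

    Assumption for this sketch: the final recorded diagnosis decides the subgroup.
    """
    if not followup_diagnoses:
        raise ValueError("at least one follow-up diagnosis is required")
    labels = {"CN": "reverter", "MCI": "stable", "AD": "decliner"}
    unknown = set(followup_diagnoses) - set(labels)
    if unknown:
        raise ValueError(f"unexpected diagnosis codes: {sorted(unknown)}")
    return labels[followup_diagnoses[-1]]
```

Under this reading, a participant recorded as MCI, then CN, then MCI again is counted as stable.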
Plasma analyte levels for all participants were quantified on the Luminex xMAP platform using the Human Discovery Map panel developed by Rules-Based Medicine (RBM). Values were missing at 12 months for 70 participants. Outliers (>5 SD) were replaced with the nearest non-outlier values. As part of the Biomarkers Consortium project, analyte distributions within each diagnosis group were transformed to approximate normality (using Box-Cox transformations) and then z-scored. Ultimately, 146 of the 190 analytes passed quality control and were included in our analysis.
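The three preprocessing steps named above (outlier replacement, Box-Cox transformation, z-scoring) can be sketched for a single analyte as follows. This is a minimal illustration, not the Biomarkers Consortium pipeline: the function names are hypothetical, the Box-Cox lambda is passed in rather than estimated, and the per-diagnosis-group handling is omitted.

```python
import math
import statistics


def replace_outliers(values, n_sd=5.0):
    """Replace values beyond n_sd SDs of the mean with the nearest non-outlier value."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    lower, upper = mean - n_sd * sd, mean + n_sd * sd
    inliers = [v for v in values if lower <= v <= upper]
    low, high = min(inliers), max(inliers)
    # Clamping leaves inliers unchanged and moves each outlier to the closest inlier.
    return [min(max(v, low), high) for v in values]


def box_cox(values, lam):
    """Box-Cox transform for strictly positive values with a given lambda."""
    if any(v <= 0 for v in values):
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        return [math.log(v) for v in values]
    return [(v ** lam - 1.0) / lam for v in values]


def z_score(values):
    """Center to mean 0 and scale to sample SD 1."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


# Toy data: 60 ordinary values plus one extreme value (not real analyte levels).
raw = [float(i % 5 + 1) for i in range(60)] + [500.0]
cleaned = replace_outliers(raw)
scaled = z_score(box_cox(cleaned, lam=0.5))  # lam=0.5 is an arbitrary choice
```

Note that a 5 SD rule can only flag a value when the sample is reasonably large, because a single extreme point also inflates the standard deviation.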
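Based on previous related publications, several models were considered for our analysis. We used a two-step process to generate our final model: first, features were selected as described below; then, classifiers were applied to the selected feature sets.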
In our initial feature selection step, and to provide an overview of single-analyte associations with AD status, logistic regression was performed for each analyte, with covariate adjustment (age, sex, APOE ε4 genotype [rs429358]), to assess its association with AD versus CN status. Multiple testing was corrected using a Bonferroni correction for the effective number of tests, estimated from the correlation matrix of all analytes with the poolr package.
We applied elastic net for its proven efficacy in handling multicollinearity, as a compromise between two regression methods, LASSO and ridge. Of note, elastic net has been gaining popularity among researchers in AD proteomics for feature selection and classifier development. We also employed random forest and naive Bayes, as these machine learning algorithms have consistently proved effective in distinguishing AD patients from CN individuals based on proteomic data. All analytes and covariates (age, sex, APOE ε4 genotype) were used in model training.
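The sense in which elastic net is a compromise between LASSO and ridge can be seen in its penalty term. The sketch below uses the parameterization documented for glmnet, in which a mixing parameter alpha weights the L1 (LASSO) part and 1 - alpha weights the L2 (ridge) part. The values of lambda and alpha used in the study are not stated in the text, so the numbers here are purely illustrative.

```python
def elastic_net_penalty(coefs, lam, alpha):
    """Elastic net penalty: lam * (alpha * ||b||_1 + (1 - alpha) / 2 * ||b||_2^2)."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    l1 = sum(abs(b) for b in coefs)
    l2_squared = sum(b * b for b in coefs)
    return lam * (alpha * l1 + 0.5 * (1.0 - alpha) * l2_squared)


coefs = [1.0, -2.0]  # illustrative coefficients
lasso_only = elastic_net_penalty(coefs, lam=1.0, alpha=1.0)  # pure L1
ridge_only = elastic_net_penalty(coefs, lam=1.0, alpha=0.0)  # pure L2
mixed = elastic_net_penalty(coefs, lam=1.0, alpha=0.5)
```

The L1 part drives some coefficients exactly to zero, which is what makes the method usable for feature selection, while the L2 part stabilizes estimates when analytes are correlated.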
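To optimize feature selection and model stability, random forest and elastic net were each run 100 times on the AD/CN subset. Analytes that appeared in more than 50% of the random forest models, and analytes that appeared in all of the elastic net models, were selected for classification.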
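The selection rule over repeated runs can be sketched as a frequency count. The example assumes that each run yields a list of selected analyte names; how those lists are produced (random forest importance thresholds, nonzero elastic net coefficients) is not specified in the text and is left out. All names and toy inputs are hypothetical.

```python
from collections import Counter


def selection_counts(runs):
    """Count in how many runs each analyte was selected."""
    counts = Counter()
    for selected in runs:
        counts.update(set(selected))  # count each analyte at most once per run
    return counts


def selected_in_more_than(runs, fraction):
    """Analytes selected in strictly more than the given fraction of runs."""
    counts = selection_counts(runs)
    return sorted(a for a, c in counts.items() if c / len(runs) > fraction)


def selected_in_all(runs):
    """Analytes selected in every run."""
    counts = selection_counts(runs)
    return sorted(a for a, c in counts.items() if c == len(runs))


# Toy runs with placeholder analyte names (4 runs instead of 100).
rf_runs = [["A", "B"], ["A", "C"], ["A", "B"], ["B"]]
en_runs = [["A", "B"], ["A"], ["A", "C"], ["A", "B"]]
signature_rf = selected_in_more_than(rf_runs, 0.5)
signature_en = selected_in_all(en_runs)
```

Requiring presence in every elastic net run is a stricter filter than the majority rule used for random forest, which trades a smaller panel for greater stability.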
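For AD and CN classification, we applied random forest, naive Bayes, and elastic net from the caret package in R to the selected features. Applying these three methods to signatures LR, EN, RF, and CM yielded 12 models.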
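For MCI subgroup classification, only elastic net was used for feature selection, dimensionality reduction, and classification, because it was the best-performing model in the AD versus CN classification described above.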
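Given the relatively small sample size, we validated the models with a cross-validation approach: we randomly split the 170 participants (112 AD and 58 CN) into training (N=136) and test (N=34) sets 10 times, trained and tested the classifiers each time, and calculated their mean accuracy, specificity, and sensitivity across the 10 datasets. The selected model was also applied to the 12-month follow-up data for temporal validation, and to the MCI dataset to explore the signature's ability to distinguish MCI stable participants from decliners.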
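The repeated random-split procedure can be sketched as below. This is an illustration under stated assumptions, not the authors' R code: the classifier is passed in as a function, a toy midpoint-threshold rule stands in for the real models, the data are synthetic, and AD is coded as 1 and CN as 0.

```python
import math
import random
import statistics


def confusion_metrics(y_true, y_pred):
    """Return (accuracy, sensitivity, specificity); AD is coded 1 and CN 0."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    sensitivity = tp / (tp + fn) if tp + fn else math.nan
    specificity = tn / (tn + fp) if tn + fp else math.nan
    return accuracy, sensitivity, specificity


def repeated_holdout(X, y, fit_predict, n_train, n_repeats=10, seed=0):
    """Average the three metrics over n_repeats random train/test splits."""
    rng = random.Random(seed)
    indices = list(range(len(y)))
    per_split = []
    for _ in range(n_repeats):
        rng.shuffle(indices)
        train, test = indices[:n_train], indices[n_train:]
        predictions = fit_predict(
            [X[i] for i in train], [y[i] for i in train], [X[i] for i in test]
        )
        per_split.append(confusion_metrics([y[i] for i in test], predictions))
    return tuple(statistics.fmean(column) for column in zip(*per_split))


def midpoint_classifier(X_train, y_train, X_test):
    """Toy stand-in for a real model: threshold one feature between class means."""
    ad = [x[0] for x, label in zip(X_train, y_train) if label == 1]
    cn = [x[0] for x, label in zip(X_train, y_train) if label == 0]
    threshold = (statistics.fmean(ad) + statistics.fmean(cn)) / 2
    return [1 if x[0] > threshold else 0 for x in X_test]


# Synthetic, perfectly separable data with the same group sizes as the text.
X = [[1.0]] * 112 + [[0.0]] * 58
y = [1] * 112 + [0] * 58
mean_accuracy, mean_sensitivity, mean_specificity = repeated_holdout(
    X, y, midpoint_classifier, n_train=136
)
```

Because the splits here are not stratified, the AD/CN ratio varies from one test set to the next, which is one reason to report the mean over several splits rather than a single split.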
Analyses were performed in R (version 4.3.2) using the key packages glmnet (v4.1.8) and caret (v7.0.1).
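The Ace CSF cohort dataset was used as the independent validation cohort in this study. Ace Alzheimer Center Barcelona was founded in 1995 as a center of research and clinical excellence for Alzheimer's disease and related dementias. It has established a comprehensive cohort aimed at advancing the understanding, diagnosis, and treatment of AD and other dementias. The cohort serves as a resource for research initiatives, particularly biomarker discovery and validation for cerebrospinal fluid (CSF) AD biomarkers and plasma biomarkers, and for exploring the genetics of late-onset dementia.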
The Ace subset used in this study comprised 1303 participants (113 CN/795 MCI/395 prevalent AD), selected according to the availability of the protein data included in the final model from ADNI. The syndromic diagnosis of all subjects was established by a multidisciplinary group of neurologists, neuropsychologists, and social workers, based on their clinical protocols. Plasma samples were obtained following consensus recommendations, and a subset of these was analyzed using the SOMAscan 7k proteomic platform (SomaLogic, Boulder, CO, US) and processed according to standard SomaLogic procedures. A total of 7307 plasma analytes were available, including all the classification analytes selected from the ADNI cohort.
Preprocessing (outlier handling, transformation, and scaling) followed the same steps as for ADNI.
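The classifiers were applied to the Ace cohort using plasma levels of the analytes identified in the ADNI dataset, together with the covariates (age, sex, and APOE ε4 status). Classifier accuracy was assessed by comparing predictions with the actual diagnostic categories or MCI subgroups.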
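In the ADNI dataset, the diagnostic groups did not differ significantly in age, sex, or education. APOE ε4 genotype (p<0.001) and MMSE (p<0.001) differed significantly between diagnostic groups. For the MCI subgroups, no significant differences were observed (p>0.1 for all characteristics).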
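Single-analyte associations with AD diagnosis were assessed by logistic regression with covariate adjustment (age, sex, APOE ε4 genotype). Seven analytes whose p-values passed Bonferroni correction (alpha=0.0006) were selected for signature LR.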
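The arithmetic behind such a threshold is a single division. The sketch below assumes a nominal alpha of 0.05, which the text does not state explicitly, and uses invented p-values with placeholder analyte names. It also shows why correcting for the effective rather than the total number of tests matters: a plain Bonferroni correction over all 146 quality-controlled analytes would be stricter than the reported 0.0006.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold for n_tests (effective) tests."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


def select_significant(p_values, threshold):
    """Names of analytes whose p-value falls below the threshold."""
    return sorted(name for name, p in p_values.items() if p < threshold)


# Plain Bonferroni over all 146 analytes (nominal alpha of 0.05 is an assumption).
plain_threshold = bonferroni_threshold(0.05, 146)

# Hypothetical p-values for placeholder analytes, screened at the reported threshold.
toy_p_values = {"analyte_1": 1e-5, "analyte_2": 0.0005, "analyte_3": 0.001}
signature_lr_toy = select_significant(toy_p_values, threshold=0.0006)
```

Here `plain_threshold` is about 0.00034, so in this toy example `analyte_2` passes the effective-test threshold but would fail the plain one.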
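Elastic net and random forest selected 15 and 11 analytes for signature EN and signature RF, respectively. Signature CM comprised the 4 analytes common to signatures LR, EN, and RF. Signature CV included only the covariates, to assess whether adding analytes improved classifier performance. Different classifiers based on these signatures were then applied, and the mean accuracy of the models on 10 randomly generated test sets was compared. Signature EN combined with the elastic net classifier achieved the highest accuracy and the strongest performance.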
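Although elastic net selections varied between runs, consistent analytes emerged when the panel size was fixed. Based on their frequency of occurrence across test sets and on panel performance, an 11-analyte panel (ApoAII, ApoB, IL16, ApoE, Vitronectin, BNP, TTR, IL6r, PYY, SGOT, and MIP1a) was selected as the final signature, because it had the highest occurrence frequency and high accuracy for AD vs. CN classification. On the same 10 test sets, the 11-analyte signature had higher mean accuracy (86.5% vs. 86.3%) and specificity (82.9% vs. 81.1%) for AD versus CN classification than the 15-analyte signature above, but slightly lower sensitivity (90.1% vs. 91.5%). Compared with the covariate-only signature (accuracy 74.4%), the 11-analyte signature (86.5%) also substantially improved disease classification.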
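To assess the accuracy of our 11-analyte signature when applied to repeated biomarker measurements (which may change over time), we applied the model to 247 ADNI participants who were diagnosed as AD/CN and had plasma analyte levels measured after 12 months. Accuracy was 72.60% (sensitivity 81.65%, specificity 49.18%), indicating that temporal variation affects classifier accuracy.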
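The 11-analyte model classified MCI decliners versus stable participants with 59.2% accuracy, outperforming the covariate-only model (53.9%). We then explored whether developing a signature specific to MCI conversion could improve accuracy. Elastic net feature selection and classification trained on MCI participants (as described above) yielded a 17-analyte signature. The 17-analyte signature had an accuracy of 58.9% and therefore showed no improvement over the 11-analyte model.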
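In the Ace study, the diagnostic groups differed significantly in age (p<0.001), sex (p=0.004), and APOE ε4 carrier status (p=0.016). For MCI decliners versus stable participants, significant differences were still observed in age (p<0.001) and APOE ε4 status (p<0.001), but not in sex (p=0.15). The 795 MCI participants had a mean follow-up of 2.6 years (SD 1.6 years). No MCI patient reverted to CN status during follow-up, which may be attributable to differences in ascertainment methods. Unlike ADNI, whose participants were recruited through broader community-based strategies, the Ace CSF cohort consists more selectively of individuals seeking medical attention, potentially leading to a lower likelihood of identifying MCI reverters.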
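We also used the 17-analyte signature generated from the ADNI MCI classification to identify MCI stable participants and MCI decliners. Accuracy was only 51.0%, but still higher than that of the covariate-only signature (43.5%). Despite the significant differences in cohort characteristics, this suggests that the analytes may provide information beyond the covariates, although performance remains limited.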
To account for cohort and platform differences between ADNI and Ace, we assessed their impact on analyte-diagnosis associations. We used covariate-adjusted logistic regression to analyze the association of each analyte with diagnosis separately in the ADNI and Ace cohorts. Eight of the 11 analytes showed the same direction of effect in both cohorts. BNP, PYY, and ApoB conferred comparable effect sizes, and their contributions to the classification model were relatively high. MIP1a, ApoAII, and ApoE showed opposite directions of effect, but their contribution to the classification model was smaller than that of the other analytes. These discrepancies may stem from differences in measurement methodologies between the SOMAscan and RBM platforms, as previous studies have reported lower correlation for some proteins across these technologies. Overall, 7 of 11 analytes in ADNI and all analytes in Ace had p-values below 0.05 in these univariate analyses, indicating their significance in distinguishing AD from CN.
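A direction-of-effect comparison of this kind reduces to checking whether the per-analyte log odds ratios have the same sign in both cohorts. The sketch below uses placeholder analyte names and invented effect sizes; it does not reproduce any estimate from the study.

```python
def direction_concordance(effects_a, effects_b):
    """Split shared analytes by whether their effects have the same sign in both cohorts."""
    shared = sorted(set(effects_a) & set(effects_b))
    same = [name for name in shared if effects_a[name] * effects_b[name] > 0]
    opposite = [name for name in shared if effects_a[name] * effects_b[name] < 0]
    return same, opposite


# Hypothetical log odds ratios for placeholder analytes in two cohorts.
cohort_1 = {"P1": 0.40, "P2": -0.30, "P3": 0.20}
cohort_2 = {"P1": 0.50, "P2": 0.10, "P3": 0.30}
same_direction, opposite_direction = direction_concordance(cohort_1, cohort_2)
```

A sign check ignores uncertainty, so in practice it is read alongside the standard errors and p-values of the individual estimates.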
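The log odds ratios of the 17 analytes used to classify MCI stable participants and MCI decliners differed substantially between the two cohorts. Eight of the 17 analytes showed opposite directions of effect, and their contributions to the model were relatively higher than those of the other analytes. In addition, the relatively large standard errors, with confidence intervals that consistently include zero, and the limited number of analytes with p-values below 0.05 (6 in ADNI, 1 in Ace) may further contribute to this reduced accuracy.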
In this study, we developed a classification model comprising 11 plasma biomarkers and covariates (sex, age, and APOE ε4 genotype) and showed that, in the ADNI cohort, the cross-validated predictive accuracy for AD versus CN classification was approximately 94%, with sensitivity and specificity of 94% and 88%, respectively. Recent studies have shown that plasma p-tau217 and ATN-based markers (CSF Aβ42/Aβ40, tau-PET) achieve AUCs >0.9 for AD diagnosis. Similarly, ATN-based classification in the ADNI cohort, using CSF Aβ42/Aβ40 (AUC=0.84) and tau-PET (AUC=0.92), also demonstrated strong performance. Our model, although it did not incorporate ATN plasma biomarkers, achieved comparable accuracy, suggesting that this set of plasma analytes and covariates may offer an additional exploratory perspective on the biological mechanisms underlying the development of AD and may thus help identify novel biological pathways or systemic processes relevant to disease development and progression. Furthermore, when applied to the independent Ace validation cohort, the 11-analyte model achieved an accuracy of 95%, despite the analytes being measured on a different proteomic platform. However, the MCI classification model, containing covariates and 17 analytes, performed less well, achieving a 10-fold cross-validated accuracy of around 66% (specificity 40%, sensitivity 85%) in the ADNI cohort and showing no meaningful predictive accuracy in the Ace cohort (51%).
Our AD/CN classification model, developed on analytes measured at baseline, showed diminished performance when applied to the month 12 measurements in ADNI, dropping to 73.6%. This limited temporal stability of the signature might arise because the concentration of substances in plasma is influenced by a variety of modifiable factors such as lifestyle, nutrition, and infections, and should therefore not be attributed solely to alterations in brain pathophysiology. The model may also be overfitted to baseline patterns, limiting its generalizability over time. However, some individuals clinically classified as CN may harbor brain pathology consistent with more advanced AD clinical stages; thus, the model may indeed detect true "biological signals".
The MCI classification model achieved a cross-validated accuracy of 58% in the ADNI cohort in distinguishing MCI decliners from MCI stable participants. When the 17-analyte model was applied to the Ace cohort, the accuracy was only 51%. This poorer performance may be related to the nosological heterogeneity of MCI, the influence of unaccounted confounding factors, and/or the presence of overlapping biological processes between MCI stable participants and decliners. Recent studies also suggest that plasma biomarkers alone may offer limited utility in distinguishing progressive from stable MCI. Reported AUCs for predicting MCI-to-AD conversion using plasma-based models typically range from 0.64 to 0.70, indicating suboptimal performance for clinical application. In contrast, transcriptomic signatures from peripheral blood have achieved higher accuracy (AUC >0.90) in differentiating progressive from stable MCI, and multimodal approaches incorporating neuroimaging or genomics may offer more robust alternatives. Meanwhile, even distinguishing MCI from AD or CN remains challenging across different biomarker-based models. While plasma-based ATN biomarkers significantly improved AD conversion prediction over a basic model (age, sex, education; AUC=0.64), even the best-performing model incorporating multiple plasma biomarkers achieved an AUC of only 0.82, suggesting that plasma biomarkers alone may not fully capture the heterogeneity of MCI, both for distinguishing MCI from AD or CN and for subtyping MCI. The lack of discernible differences in covariates between the two MCI groups within the ADNI dataset also reduced the accuracy of the classification models. Consequently, when this model is applied to a new cohort with analogous analyte distributions but disparate covariate distributions, the accuracy is likely to be further compromised.
Additionally, when comparing MCI subgroups based on a follow-up period of up to 48 months (mean 2.8 ± 1.5 years) in ADNI, comparable to the Ace cohort, to those based on up to 198 months in ADNI participants, we observed that 10.9% of participants shifted between MCI subgroups, notably among MCI stable participants and reverters. This suggests that participant classifications in the Ace cohort may not fully capture their cognitive trajectories, potentially affecting the accuracy of MCI decliner and stable classifications in our study.
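In this study, vitronectin was the only analyte found to be a potential predictor both of AD versus CN status and of MCI progression to dementia. Although its coefficients were low, vitronectin, a regulator of the extracellular environment, has been implicated in inflammation and amyloid-related deposition, but its role in AD remains unclear.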
The limited overlap between the AD-CN signature and the MCI signature may stem from multiple factors. Crane et al. noted statistically significant differences in measurement precision across cognitive domains, and others suggested that, within the ADNI dataset, 34.2% of MCI diagnoses were "false positives" (CN clinically misclassified as MCI) and 7.1% were false negatives (MCI misclassified as CN). In spite of these observations, we adopted the ADNI "formal" diagnosis for our analyses. Furthermore, the analytes selected for the AD-CN signature reflect markers of advanced disease stages that may not be detectable in MCI patients, especially those in the early stages or with minimal cognitive decline.
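Several researchers have previously attempted to classify AD and CN participants in ADNI using plasma proteomic signatures. Compared with these studies, our signature achieved higher specificity and sensitivity on the same dataset, despite using a smaller panel. In addition, our model was validated in an independent cohort measured on a different proteomic platform, further corroborating the robustness and generalizability of our findings.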
Among the components of our signature and of previous proteomic signatures for AD-CN classification, the common proteins are ApoE (apolipoprotein E) and BNP (B-type natriuretic peptide). Previous studies have suggested that the ability of plasma ApoE to distinguish between AD and CN individuals is primarily driven by APOE ε4 genotype. However, in our analysis, ApoE was still selected even after adjusting for APOE genotype, although its coefficient in the model was lower than the average. This could be due to residual predictive value of ApoE beyond genotype or to population heterogeneity. ApoE is a lipid transporter involved in various cellular functions, such as neural signal transmission and neuroinflammation. BNP primarily regulates circulation and vascular function. The exact mechanistic relationship between BNP and AD, as well as the potential of BNP as an AD biomarker, warrants further research.
In the 11-analyte model for distinguishing AD from CN, BNP made the largest contribution and showed similar and significant associations with AD diagnosis in both the ADNI and Ace cohorts. Although the mechanism remains unclear, there have been several reports of an association between elevated BNP levels and increased AD risk, both in in vivo cohort studies and in postmortem neuropathological studies. For MCI classification, myoglobin was the analyte with the strongest effect in our study. Myoglobin was also one of the few analytes whose association with MCI classification replicated in Ace. It is a cytoplasmic hemoprotein capable of reversibly binding oxygen (O2) and releasing it during periods of hypoxia or anoxia. It may serve as a scavenger of reactive oxygen species (ROS) within cells, mitigating oxidative stress. Numerous clinical trials have shown that altering brain oxygenation intermittently can effectively enhance short-term memory and attention in elderly individuals with amnestic MCI (aMCI) and improve cognitive function.
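Our signature is based on elastic net, a statistical modeling and machine learning method widely used for high-dimensional data analysis. Its feature selection process is entirely data-driven, aiming to capture comprehensively the analytes relevant to AD. Given the pervasive correlations among human gene products, elastic net can effectively mitigate problems caused by collinearity among analytes by combining L1-norm and L2-norm penalties, with the added advantage of producing sparse models that retain the most influential features. In addition, dimensionality reduction reduced the panel size with minimal impact on accuracy. External validation in the Ace cohort further supported the model's generalizability across platforms. However, our model showed lower performance in classifying MCI converters using a similar approach. Limitations include outdated analyte selection, diagnostic variability, and limited platform validation. Future studies should expand analyte coverage, harmonize diagnostic criteria, and explore cross-platform reproducibility.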
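In conclusion, our results confirm the reliability and consistency of a set of plasma analytes across platforms, and their potential utility in contributing to a more precise and personalized characterization of individuals with AD. This may be of value in future drug research and development efforts toward more effective and personalized new therapies for cognitive decline and dementia due to AD. In this study, using elastic net, we developed and externally validated a precise and sensitive biomarker signature for AD identification, which may have value as an additional plasma biomarker-based, data-driven tool for classifying disease status. We further highlight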