Metabolomics Combined with Machine Learning LASSO Regression to Identify Differential Biomarkers between Taigu Yam and Tiegun Yam
Abstract: This study aimed to screen the differential metabolites between Taigu yam and Tiegun yam using a metabolomics approach, and to determine differential markers for predicting yam variety with the least absolute shrinkage and selection operator (LASSO) regression method. Ultra-performance liquid chromatography-quadrupole time-of-flight tandem mass spectrometry (UPLC-Q-TOF-MS/MS) was used to analyze the metabolites of the two yams. Principal component analysis (PCA) and orthogonal partial least squares-discriminant analysis (OPLS-DA) were applied to identify the metabolites that differ between the two varieties, and LASSO regression was then used to screen differential markers and establish a prediction model for variety identification. A total of 206 metabolites were identified in the two yams. PCA clearly separated Taigu yam from Tiegun yam, and OPLS-DA further screened out 56 significantly different metabolites. LASSO regression on these differential metabolites yielded five key differential markers: ophiogenin 3-O-beta-L-rhamnopyranosyl-beta-D-glucopyranoside, aspartic acid, epicatechin gallate, schaftoside and gallocatechin. These markers were used to establish a LASSO regression prediction model for distinguishing Taigu yam from Tiegun yam. By combining metabolomics with LASSO regression, this study identified differential markers between Taigu yam and Tiegun yam and constructed a prediction model for yam variety, providing a new approach to yam identification.
Keywords: metabolomics; machine learning; Taigu yam; Tiegun yam; differential markers
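Chinese yam, the rhizome of Dioscorea opposita Thunb. (Dioscoreaceae), is a traditional Chinese plant used as both food and medicine, with a long history of medicinal use. It is mild in nature, tonifies warmly without being drastic, and is faintly fragrant without being drying; it is credited with invigorating the spleen, tonifying the lung, securing the kidney and replenishing essence. The Bencao Mengquan records that yam "is produced in prefectures north and south alike, but only that from Huaiqing Prefecture (in present-day Jiaozuo, Henan) is of fine quality". Yam from the ancient Huaiqing region (today's Wen County, Wuzhi County, Qinyang City and neighboring areas along the Qin River in Jiaozuo, Henan Province) is firm in texture, rich in starch and pure white, and has long been renowned at home and abroad[1]. Its efficacy is well established, and it is a staple medicinal material commonly used in clinical traditional Chinese medicine[2−4]. Recent studies have shown that yam is rich in active constituents, including amino acids, sugars, polyphenols, saponins, allantoin and flavonoids[5−6]. These constituents confer a range of biological activities, such as immunomodulatory, anti-inflammatory, gastrointestinal-protective, glucose- and lipid-metabolism-regulating and antioxidant effects[7−9].
Taigu yam and Tiegun yam are two important yam varieties, each with its own characteristics in appearance, nutritional value and medicinal value. In appearance, Taigu yam has a rough skin and a white, fine cross-section, and is brittle and easily broken, whereas Tiegun yam has a rough skin with red spots, long and dense fibrous roots, a fine white cross-section and a firm texture. In terms of nutritional and medicinal value, Taigu yam usually has a higher water content and a fine texture; it is widely planted and mainly processed into medicinal material, serving as the raw material for Mao Shanyao and Guang Shanyao (two processed forms of yam). Tiegun yam usually contains more starch, amino acids and crude protein and has a hard, firm texture; because of its low yield and small planting area, it is mainly used as a health food. Variety is an important intrinsic factor affecting the nutritional quality of yam, yet reports on compositional differences between Taigu yam and Tiegun yam remain relatively limited. La et al.[10] acquired yam mass spectrometry data in positive ion mode and identified three differential constituents (phytic acid, ergosterol and stigmasterol) by principal component analysis; however, that study did not analyze the chemical constituents of the two yams comprehensively and systematically. The application of metabolomics to medicine-food homologous plants is attracting growing attention: it can systematically profile the metabolites of such plants and also reveal metabolite differences among varieties and growing environments[11−12]. Meanwhile, machine learning, a key part of artificial intelligence, can mine key information from large datasets to build prediction models[13−14]. Machine learning algorithms can identify key features and patterns in metabolomics data more accurately, enabling efficient and reliable discrimination[15−17].
In this study, ultra-performance liquid chromatography-quadrupole time-of-flight tandem mass spectrometry (UPLC-Q-TOF-MS/MS) was used to comprehensively analyze the chemical constituents of Taigu yam and Tiegun yam. Metabolomics was then combined with least absolute shrinkage and selection operator (LASSO) regression, a machine learning method, to screen and determine the key differential markers between the two yams. Based on these markers, a prediction model was constructed to distinguish the two yams accurately and reliably. This study provides a scientific basis for further exploring the differences between Taigu yam and Tiegun yam, and supports the effective utilization and development of yam.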
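1. Materials and Methods
1.1 Materials and Instruments
Taigu yam and Tiegun yam were planted on March 12, 2022 at the Huai yam experimental base of the Wenxian Institute of Agricultural Sciences and harvested on November 25, 2022. Methanol and acetonitrile (chromatographic grade) were purchased from Merck (USA); formic acid and ammonium formate (chromatographic grade) were from Shanghai ANPEL Laboratory Technologies Inc.
TripleTOF X500R ultra-performance liquid chromatography-quadrupole time-of-flight mass spectrometer, SCIEX (USA); Milli-Q Synthesis ultrapure water system, Millipore (USA); KQ-3200DE ultrasonic extractor, Kunshan Ultrasonic Instruments Co., Ltd.; GIPP-DCY-12Y water-bath nitrogen evaporator, Shanghai Jipu Electronic Technology Co., Ltd.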
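1.2 Experimental Methods
1.2.1 Sample preparation
Yam samples were washed, air-dried, sliced and immediately ground to powder in liquid nitrogen. A 0.5 g portion of yam powder was mixed with 5 mL of 75% aqueous methanol, vortexed for 30 s, extracted by ultrasonication at 20 kHz for 45 min and centrifuged at 12000 r/min for 10 min. The supernatant was collected, dried under nitrogen, redissolved in 2 mL of methanol and passed through a 0.22 µm membrane filter. Quality control (QC) samples were prepared by mixing equal amounts of the redissolved, filtered extracts of the two yams. Ten extracts of each variety and ten QC samples were prepared. During data acquisition, QC samples were interspersed with the test samples: three QC injections at the beginning, three at the end, and one QC injection after every four samples in between.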
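The injection scheme described above can be written out explicitly. The following is a minimal sketch, not the authors' acquisition script: the sample labels and the function name `build_run_order` are hypothetical, and the order of the test samples (here simply as listed) is an assumption, since the text does not say whether they were randomized.

```python
def build_run_order(samples, n_edge_qc=3, block_size=4):
    """Build an injection order: QC block, samples with one QC between
    consecutive blocks of `block_size` samples, then a closing QC block."""
    order = ["QC"] * n_edge_qc
    for start in range(0, len(samples), block_size):
        if start > 0:
            order.append("QC")  # one QC between consecutive sample blocks
        order.extend(samples[start:start + block_size])
    order.extend(["QC"] * n_edge_qc)
    return order


# Hypothetical labels for the 10 Taigu and 10 Tiegun extracts.
samples = [f"TAIGU_{i}" for i in range(1, 11)] + [f"TIEGUN_{i}" for i in range(1, 11)]
run_order = build_run_order(samples)
print(len(run_order), run_order.count("QC"))
```

With 20 test samples this layout uses 3 + 4 + 3 = 10 QC injections, which agrees with the ten QC samples stated in the text.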
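1.2.2 Chromatographic conditions
Column: T3 column (2.1 mm × 100 mm, 1.7 μm). Mobile phase: A, pure water with 0.02% formic acid; B, methanol:acetonitrile (1:1). Gradient elution: 0–1 min, 5% B; 1–8 min, 5%–98% B; 9–18 min, 98% B; 18–18.1 min, 98%–5% B; 18.1–20 min, 5% B. Flow rate: 0.3 mL/min; injection volume: 5 μL.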
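For readers who want the mobile-phase composition at an arbitrary time point, the gradient program can be treated as a piecewise-linear function. This is an illustrative sketch only: the name `percent_b` is hypothetical, and because the program as written does not specify 8–9 min, the code assumes that 98% B is simply held over that interval.

```python
# (time in min, % B) breakpoints taken from the gradient program.
# Assumption: 98% B is held from 8 to 18 min (8-9 min is not specified).
GRADIENT = [(0.0, 5.0), (1.0, 5.0), (8.0, 98.0), (18.0, 98.0), (18.1, 5.0), (20.0, 5.0)]


def percent_b(t_min):
    """Return % B at time t_min by linear interpolation between breakpoints."""
    if not GRADIENT[0][0] <= t_min <= GRADIENT[-1][0]:
        raise ValueError("time outside the 0-20 min gradient program")
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)
    raise ValueError("unreachable for a well-formed gradient table")


print(percent_b(4.5))  # midpoint of the 1-8 min ramp
```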
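1.2.3 Mass spectrometry conditions
An electrospray ionization source was used with the following parameters: ion spray voltage, +5500 V/−4500 V; source temperature, 500 ℃; nebulizer gas, 50 psi; auxiliary gas, 55 psi; curtain gas, 35 psi; declustering potential, 80 V; collision energy, (40±20) V. The MS scan range was m/z 80–1500 and the MS/MS scan range was m/z 50–1500. Data were acquired in both positive and negative ion modes.
1.2.4 Metabolite identification and omics data analysis
SCIEX OS software was used for data acquisition and processing. The main metabolites in the samples were identified against the SCIEX TCM Library, a high-resolution MS/MS database of natural products, with a precursor mass error below 5 ppm and a library match score above 80. The resulting quasi-molecular ions, fragmentation pathways and related information were then checked against public databases such as METLIN (https://metlin.scripps.edu) and ChemSpider (http://www.chemspider.com) for final confirmation. Principal component analysis (PCA) and orthogonal partial least squares-discriminant analysis (OPLS-DA) were performed in R (v4.2.2). PCA was used to show the similarities and differences among samples, and OPLS-DA to find the differential metabolites between Taigu yam and Tiegun yam.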
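The 5 ppm mass-error criterion can be illustrated with a short calculation. The sketch below is not part of the SCIEX OS workflow; the function names are hypothetical, and the check uses the aspartic acid entry of Table 1 (C4H7NO4, [M-H]−, observed m/z 132.0300) together with standard monoisotopic masses.

```python
# Monoisotopic masses (u) of the elements needed here, and the proton mass.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}
PROTON = 1.00727646


def ion_mz(composition, adduct):
    """Theoretical m/z of a singly charged [M+H]+ or [M-H]- ion."""
    neutral = sum(MONOISOTOPIC[el] * n for el, n in composition.items())
    if adduct == "[M+H]+":
        return neutral + PROTON
    if adduct == "[M-H]-":
        return neutral - PROTON
    raise ValueError(f"unsupported adduct: {adduct}")


def ppm_error(observed, theoretical):
    """Relative mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


aspartic_acid = {"C": 4, "H": 7, "N": 1, "O": 4}
theoretical = ion_mz(aspartic_acid, "[M-H]-")
error = ppm_error(132.0300, theoretical)
print(round(theoretical, 4), abs(error) < 5.0)
```

An identification would pass the stated criterion when the absolute error is below 5 ppm, as it is for this entry.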
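1.2.5 Machine learning model construction
The metabolomics results of eight randomly selected batches each of Taigu yam and Tiegun yam were used as the training set, and the remaining two batches of each variety as the prediction set. LASSO regression was used to screen the key differential markers of the two yams and to build a prediction model that distinguishes Taigu yam from Tiegun yam. For the key metabolites selected, the following model was used to predict the yam variety:
Model score = β0 + β1×X1 + β2×X2 + ⋯ + βn×Xn, where β0 is the intercept; β1, β2, …, βn are the weight coefficients of the metabolites; and X1, X2, …, Xn are the input features. The weight coefficients were obtained with the LASSO regression algorithm to find the differential markers that distinguish Taigu yam samples from Tiegun yam samples.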
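The random 8/2 split per variety described in this subsection could be sketched as follows. This is only an illustration: the batch labels, the function name `split_batches` and the fixed seed are assumptions, and the original analysis may have drawn the split differently.

```python
import random


def split_batches(batch_ids, n_train=8, seed=0):
    """Randomly split one variety's batches into training and prediction sets."""
    rng = random.Random(seed)  # fixed seed only to make the sketch reproducible
    shuffled = list(batch_ids)
    rng.shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]


# Hypothetical batch labels, ten per variety.
taigu = [f"Taigu_{i}" for i in range(1, 11)]
tiegun = [f"Tiegun_{i}" for i in range(1, 11)]

taigu_train, taigu_test = split_batches(taigu, seed=1)
tiegun_train, tiegun_test = split_batches(tiegun, seed=2)
print(len(taigu_train + tiegun_train), len(taigu_test + tiegun_test))
```

Splitting each variety separately keeps both classes equally represented in the training set (16 batches) and the prediction set (4 batches).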
To validate the performance of the prediction model, a receiver operating characteristic (ROC) curve was used to evaluate its accuracy and reliability.
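The area under the ROC curve (AUC) equals the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one. The sketch below computes it this way; the function name `roc_auc` and the example scores are hypothetical and do not come from the study's data.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.
    Ties count as half a correct pair. Labels: 1 = positive, 0 = negative."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical model scores: perfectly separated classes give AUC = 1.
print(roc_auc([2.1, 0.8, -0.5, -1.9], [1, 1, 0, 0]))
# One misranked pair out of four gives AUC = 0.75.
print(roc_auc([0.9, 0.4, 0.6, 0.1], [1, 1, 0, 0]))
```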
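2. Results and Analysis
2.1 Overall analysis of metabolites in Taigu yam and Tiegun yam
Based on the detection data of the yam samples, metabolites were identified using the natural product database SCIEX TCM Library and public metabolite databases. A total of 206 metabolites were identified: lipids were the most numerous (31), followed by flavonoids (30), amino acids and derivatives (24), phenolic acids (21), organic acids (19), alkaloids (6), esters and saponins (5 each), vitamins (4), nucleotides, terpenoids, coumarins and sugars (3 each), amides and lignans (2 each), and others (45). The distribution of metabolite classes in Taigu yam and Tiegun yam is shown in Fig. 1.
2.2 Principal component analysis of Taigu yam and Tiegun yam
Unsupervised PCA was used to show the overall distribution of all samples. As shown in the PCA score plot (Fig. 2), the first principal component (PC1) explained 45.2% of the variance and the second (PC2) 14.3%, for a cumulative contribution of 59.5%. Taigu yam and Tiegun yam were clearly separated along PC1 and PC2, indicating marked differences in metabolites between the two varieties. Because the two varieties were grown in the same environment, the separation between samples is likely related to varietal characteristics. In addition, the QC samples clustered tightly in the PCA, indicating good experimental repeatability and instrument stability.
2.3 Screening of differential metabolites between Taigu yam and Tiegun yam
To better capture between-group differences, a supervised OPLS-DA model was used to analyze the differential metabolites between Taigu yam and Tiegun yam. Fig. 3A shows the OPLS-DA score plot, in which the two yams are clearly separated. Fig. 3B shows the permutation test, which indicates that the OPLS-DA model is valid and reliable and can be used to screen differential metabolites. Differential metabolites were selected by combining the variable importance in projection (VIP) from OPLS-DA with the fold change (FC) from univariate analysis. A total of 56 differential metabolites with VIP ≥ 1 and |log2FC| > 1 were screened out, comprising 14 flavonoids, 13 phenolic acids, 6 lipids, 4 amino acids and derivatives, 4 esters, 3 organic acids, 2 nucleosides, 2 amides, 1 coumarin, 1 vitamin, 1 terpenoid and 5 others; details are given in Table 1.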
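The screening rule (VIP ≥ 1 and |log2FC| > 1) amounts to a simple filter. The sketch below is illustrative: the function names are hypothetical, the two named metabolites use the VIP and log2FC values reported in Table 1, and the two "hypothetical" entries are invented solely to show a rejection. Which variety is the numerator of the fold change is an assumption in `log2_fold_change`.

```python
import math


def log2_fold_change(mean_a, mean_b):
    """log2 of the ratio of mean abundances (group A over group B; assumed order)."""
    return math.log2(mean_a / mean_b)


def is_differential(vip, log2fc, vip_min=1.0, fc_min=1.0):
    """Apply the screening rule VIP >= 1 and |log2FC| > 1."""
    return vip >= vip_min and abs(log2fc) > fc_min


# (name, VIP, log2FC); the first two rows are from Table 1, the rest are made up.
candidates = [
    ("aspartic acid", 3.64, 15.37),
    ("schaftoside", 1.08, -1.45),
    ("hypothetical metabolite A", 1.50, 0.40),  # fails the fold-change cut-off
    ("hypothetical metabolite B", 0.70, 2.30),  # fails the VIP cut-off
]
selected = [name for name, vip, fc in candidates if is_differential(vip, fc)]
print(selected)
```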
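Table 1. Information on differential metabolites between Taigu yam and Tiegun yam
| Class | Compound | Adduct | MS (m/z) | MS/MS (m/z) | Formula | Retention time (min) | log2FC | VIP |
|---|---|---|---|---|---|---|---|---|
| Flavonoids | 5,7-dihydroxy-2-(4-hydroxyphenyl)-6,8-bis(3,4,5-trihydroxyoxan-2-yl)chromen-4-one | [M+H]+ | 535.1450 | 415.1032, 295.0601 | C25H26O13 | 5.75 | −11.88 | 3.20 |
| Flavonoids | Isorhamnetin | [M-H]− | 315.0513 | 300.0273, 164.0115 | C16H12O7 | 8.69 | −10.05 | 2.81 |
| Flavonoids | Aromadendrin | [M-H]− | 287.0565 | 165.0181, 153.0186 | C15H12O6 | 7.03 | −2.62 | 1.49 |
| Flavonoids | Isorhamnetin-3-O-glucoside | [M-H]− | 477.1047 | 315.0496 | C22H22O12 | 6.44 | −1.81 | 1.22 |
| Flavonoids | L-Glutathione | [M-H]− | 611.1458 | 306.1223, 272.114 | C20H32N6O12S2 | 6.22 | −1.51 | 1.11 |
| Flavonoids | Schaftoside | [M-H]− | 563.1408 | 473.1080, 365.0659 | C26H28O14 | 5.17 | −1.45 | 1.08 |
| Flavonoids | Vitexin rhamnoside | [M-H]− | 577.1565 | 431.0972, 311.0543 | C27H30O14 | 5.84 | −1.45 | 1.08 |
| Flavonoids | Xanthone I | [M+H]+ | 409.1649 | 353.1018, 335.0910 | C24H24O6 | 1.13 | −1.24 | 1.01 |
| Flavonoids | Isoliquiritigenin | [M-H]− | 255.0664 | 119.0496, 91.0491 | C15H12O4 | 7.60 | 1.63 | 1.17 |
| Flavonoids | Pectolinarigenin | [M-H]− | 313.0720 | 298.0455, 284.0261 | C17H14O6 | 9.39 | 2.14 | 1.34 |
| Flavonoids | Xanthohumol | [M-H]− | 353.1377 | 233.0786, 119.0514 | C21H22O5 | 6.89 | −1.74 | 1.21 |
| Flavonoids | Nobiletin | [M+H]+ | 403.1400 | 387.1044, 373.0962 | C21H22O8 | 10.24 | −1.63 | 1.12 |
| Flavonoids | Isorhamnetin-3-O-neohesperidoside | [M-H]− | 623.1639 | 315.0513, 299.0196 | C28H32O16 | 6.13 | 3.54 | 1.55 |
| Flavonoids | Sophoraflavonoloside | [M+H]+ | 611.1604 | 448.1007, 287.0551 | C27H30O16 | 10.98 | −1.52 | 1.10 |
| Phenolic acids | Gallocatechin | [M-H]− | 305.0665 | 165.0191, 137.0243 | C15H14O7 | 4.26 | −13.42 | 3.40 |
| Phenolic acids | Hydroxytyrosol | [M-H]− | 153.0556 | 135.0449, 121.0293 | C8H10O3 | 3.96 | −3.79 | 1.80 |
| Phenolic acids | Myrislignan | [M+Na]+ | 397.1621 | 274.3724 | C21H26O6 | 8.33 | −1.82 | 1.23 |
| Phenolic acids | Protocatechuic acid | [M-H]− | 153.0194 | 135.0083, 109.0312 | C7H6O4 | 3.82 | −1.67 | 1.14 |
| Phenolic acids | 3-Methoxy-4-hydroxyphenyl 6-O-(3,4,5-trihydroxybenzoyl)-beta-D-glucopyranoside | [M+H]+ | 455.1538 | 437.1078, 315.0711 | C20H22O12 | 5.32 | −1.64 | 1.13 |
| Phenolic acids | Batatasin IV | [M+H]+ | 245.1177 | 151.0755, 137.0599 | C15H16O3 | 7.72 | −1.59 | 1.15 |
| Phenolic acids | Batatasin III | [M+H]+ | 245.1177 | 151.0755, 123.0804 | C15H16O3 | 9.78 | −1.31 | 1.03 |
| Phenolic acids | 3'-O-Methylbatatasin III | [M+H]+ | 259.1333 | 241.1223 | C16H18O3 | 11.21 | 2.56 | 1.45 |
| Phenolic acids | Rosmarinic acid | [M-H]− | 359.2090 | 197.0452, 179.0345 | C21H30NO4 | 6.58 | 6.59 | 2.01 |
| Phenolic acids | Rosmanol | [M-H]− | 345.1707 | 301.1807 | C20H26O5 | 6.83 | 7.83 | 1.70 |
| Phenolic acids | Procyanidin B2 | [M-H]− | 577.1359 | 425.0715, 271.0436 | C30H26O12 | 4.44 | 3.41 | 1.69 |
| Phenolic acids | (+)-Catechin | [M-H]− | 289.0720 | 151.0403, 109.0241 | C15H14O6 | 5.19 | 13.81 | 3.45 |
| Phenolic acids | Digallic acid | [M-H]− | 322.0325 | 303.0461, 260.0312 | C14H10O9 | 14.00 | −1.25 | 1.0 |
| Lipids | 1-(9Z-Octadecenoyl)-sn-glycero-3-phosphoethanolamine | [M+H]+ | 480.3095 | 265.2523, 44.0496 | C23H46NO7P | 12.83 | −6.91 | 1.83 |
| Lipids | Linoleic acid | [M-H]− | 279.2331 | 261.2219, 59.0138 | C18H32O2 | 14.29 | −2.17 | 1.35 |
| Lipids | Octadecanedioic acid | [M-H]− | 313.2385 | 295.2279, 269.2487 | C18H34O4 | 11.27 | −4.61 | 1.98 |
| Lipids | Oleic acid | [M-H]− | 281.2488 | 263.2072, 111.0673 | C18H34O2 | 15.18 | −1.62 | 1.12 |
| Lipids | 9,12,15-Octadecatrienoic acid | [M+H]+ | 515.3215 | 497.3134, 335.2583 | C27H46O9 | 13.38 | 1.75 | 1.19 |
| Lipids | Phytosphingosine | [M+H]+ | 318.3005 | 300.2890, 282.2786 | C18H39NO3 | 11.39 | 4 | 1.36 |
| Amino acids and derivatives | Pantothenic acid | [M-H]− | 218.1032 | 146.0826, 88.0412 | C9H17NO5 | 3.57 | −1.54 | 1.14 |
| Amino acids and derivatives | Histidine | [M-H]− | 154.0621 | 136.9023, 93.1002 | C6H9N3O2 | 1.18 | 4.11 | 1.87 |
| Amino acids and derivatives | α-Aminobutyric acid | [M+H]+ | 104.0701 | 87.0442, 69.0339 | C4H9NO2 | 1.27 | 5 | 2.07 |
| Amino acids and derivatives | Aspartic acid | [M-H]− | 132.0300 | 115.0027, 88.0397 | C4H7NO4 | 1.19 | 15.37 | 3.64 |
| Esters | Methyl rosmarinate | [M+H]+ | 375.1290 | 315.0871, 163.0392 | C19H18O8 | 1.18 | −5.92 | 1.73 |
| Esters | β-Sitosterol acetate | [M+H]+ | 457.4027 | 415.3862, 397.3840 | C31H52O2 | 12.07 | −1.45 | 1.09 |
| Esters | Epicatechin gallate | [M+H]+ | 443.0969 | 139.0389, 123.0441 | C22H18O10 | 4.01 | 3.37 | 1.70 |
| Esters | Methyl linoleate | [M+H]+ | 295.2633 | 263.2369 | C19H34O2 | 16.05 | 1.32 | 1.04 |
| Organic acids | p-Hydroxycinnamic acid | [M-H]− | 163.0402 | 145.0296, 119.0501 | C9H8O3 | 7.27 | 8.89 | 2.42 |
| Organic acids | p-Hydroxybenzoic acid | [M-H]− | 137.0244 | 93.0343, 59.1023 | C7H6O3 | 7.15 | 9.22 | 2.69 |
| Organic acids | Isocitric acid | [M-H]− | 191.0193 | 147.0295, 117.0188 | C6H8O7 | 1.78 | −3.65 | 1.77 |
| Nucleosides | 5'-Methylthioadenosine | [M+H]+ | 298.0967 | 163.0398, 136.0596 | C11H15N5O3S | 4.33 | −2.68 | 1.08 |
| Nucleosides | Guanosine | [M-H]− | 282.0846 | 150.0423 | C10H13N5O5 | 1.90 | 5 | 2.07 |
| Amides | N-trans-p-Coumaroyltyramine | [M+H]+ | 284.1283 | 147.0441, 138.0915 | C17H17NO3 | 7.32 | 1.73 | 1.17 |
| Amides | N-Oleoylethanolamine | [M+H]+ | 326.3063 | 62.0596 | C20H39NO2 | 14.06 | 2.22 | 1.30 |
| Coumarins | Oxypeucedanin | [M-H]− | 285.0774 | 201.0186, 157.0290 | C16H14O5 | 7.87 | 1.62 | 1.17 |
| Vitamins | Vitamin D2 | [M+H]+ | 397.3464 | 99.5113 | C28H44O | 16.54 | −1.49 | 1.11 |
| Terpenoids | Ophiogenin 3-O-beta-L-rhamnopyranosyl-beta-D-glucopyranoside | [M+H]+ | 755.4238 | 575.3577, 429.3013 | C39H62O14 | 6.12 | 14.08 | 3.48 |
| Others | Indole-3-carboxaldehyde | [M+H]+ | 146.0597 | 118.0690, 91.0560 | C9H7NO | 4.15 | −4.72 | 1.51 |
| Others | 2,7-Dihydroxy-4-methoxy-9,10-dihydrophenanthrene | [M+H]+ | 243.0653 | 228.0782, 211.0745 | C14H10O4 | 6.52 | −1.64 | 1.15 |
| Others | 1,8-Dihydroxyanthraquinone | [M-H]− | 239.0352 | 211.0401, 167.0502 | C14H8O4 | 9.86 | −1.57 | 1.15 |
| Others | 4-[3,5-dihydroxy-7-(4-hydroxyphenyl)heptyl]benzene-1,2-diol | [M+H]+ | 333.1704 | 107.0491 | C19H24O5 | 6.24 | −4.21 | 1.90 |
| Others | Allantoin | [M-H]− | 157.0367 | 114.0302, 71.0242 | C4H6N4O3 | 1.30 | 2.14 | 1.34 |

Fig. 4 shows a clustering heatmap of the log-transformed abundances of the differential metabolites, in which each row represents a compound, each column a sample, and the color the abundance level: the darker the color, the higher the abundance, and the lighter the color, the lower the abundance. As Fig. 4 shows, 33 differential metabolites were more abundant in Tiegun yam, with flavonoids and phenolic acids the most numerous (11 and 7, respectively); among them, batatasin III and batatasin IV, active constituents characteristic of yam, were more abundant in Tiegun yam. Batatasin III and batatasin IV have been reported to have the potential to inhibit α-glucosidase activity and may therefore have hypoglycemic effects[18]. In addition, batatasins have been reported to show plant growth regulator-like activity[19]: at low concentrations they can promote seed germination or plant growth, whereas above a certain concentration range they may inhibit plant growth and development. Compared with Tiegun yam, 23 metabolites were more abundant in Taigu yam, including bioactive compounds such as isoliquiritigenin, pectolinarigenin, isorhamnetin-3-O-neohesperidoside, procyanidin B2, catechin, allantoin and epicatechin gallate. Isoliquiritigenin, a chalcone, has anti-inflammatory, antioxidant and antidiabetic activities[20]; pectolinarigenin shows antitumor, anti-inflammatory and antioxidant activities[21]; epicatechin gallate and catechin exhibit a variety of biological activities, including anti-inflammatory, metabolic disease-improving, antioxidant, antitumor, cardiovascular-protective and neuroprotective effects[22−24]; and allantoin has shown potential in activating imidazoline receptors, enhancing energy metabolism, and countering oxidative stress and inflammation[25−27].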
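2.4 Screening of differential markers with the machine learning model
LASSO regression is a linear regression algorithm used for feature selection and variable screening. It adds an L1 regularization term to ordinary linear regression, which shrinks the coefficients of features with little influence on the outcome to exactly zero, thereby eliminating irrelevant features and retaining the target metabolites that are significantly associated with the difference between sample groups[28,29]. Fig. 5A shows the cross-validation curve of the LASSO model: the lower horizontal axis is the logarithm of the penalty coefficient, the upper horizontal axis is the number of non-zero coefficients in the model at that penalty, and the vertical axis is the cross-validated mean squared error. In LASSO regression, the penalty coefficient is usually decreased gradually from a series of larger values, and a smaller cross-validated mean squared error indicates a better fit. As the penalty increases, the later a variable's coefficient is shrunk to zero, the more important the variable. The minimum cross-validated mean squared error was 0.005, at which five optimal features were selected. By adding a penalty during model estimation, LASSO can improve the predictive accuracy and the generalization ability of the model[30−31]. Each curve in Fig. 5B traces the coefficient of one independent variable; the vertical axis is the coefficient value, the lower horizontal axis the logarithm of the penalty coefficient, and the upper horizontal axis the number of non-zero coefficients in the model. The five predictive features with non-zero regression coefficients were ophiogenin 3-O-beta-L-rhamnopyranosyl-beta-D-glucopyranoside, aspartic acid, epicatechin gallate, schaftoside and gallocatechin; these differential metabolites may serve as key differential markers of the two yams.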
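How the L1 penalty drives coefficients to exactly zero can be seen in a small coordinate-descent implementation. The sketch below is a generic textbook LASSO solver, not the software used in the study (which is not specified beyond R); the function names and the toy data are hypothetical, and the penalty is fixed by hand rather than chosen by cross-validation. It minimizes (1/2n)·Σ(y − β0 − Xβ)² + λ·Σ|βj|.

```python
def soft_threshold(z, lam):
    """Soft-thresholding operator: shrink z toward zero by lam."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_fit(X, y, lam, n_iter=200):
    """Fit LASSO by cyclic coordinate descent; returns (intercept, coefficients)."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]  # centered features
    resid = [v - y_mean for v in y]  # residual for all-zero coefficients
    col_sq = [sum(Xc[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # constant feature carries no information
            # Correlation of feature j with the partial residual (feature j added back).
            rho = sum(Xc[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new_beta = soft_threshold(rho, lam) / col_sq[j]
            delta = new_beta - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= Xc[i][j] * delta
                beta[j] = new_beta
    intercept = y_mean - sum(b * m for b, m in zip(beta, x_mean))
    return intercept, beta


# Toy data: feature 0 tracks the class label, feature 1 is only weakly related.
X = [[1.0, 1.0], [2.0, -1.0], [3.0, 1.0], [7.0, -1.0], [8.0, 1.0], [9.0, -1.0]]
y = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]
intercept, beta = lasso_fit(X, y, lam=0.1)
print(beta)                        # feature 1 is shrunk to exactly zero
print(lasso_fit(X, y, lam=2.0)[1]) # a large penalty removes every feature
```

In this toy case the weak feature is dropped at a moderate penalty while the informative one survives, which mirrors the statement that variables whose coefficients are shrunk to zero later along the penalty path are the more important ones.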
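The five selected metabolites were then used as indicators to construct the following LASSO regression model:
Model score = 3.1627 + 0.1860 × log2(abundance of ophiogenin 3-O-beta-L-rhamnopyranosyl-beta-D-glucopyranoside) + 0.0405 × log2(abundance of aspartic acid) + 0.0009 × log2(abundance of epicatechin gallate) − 0.1469 × log2(abundance of schaftoside) − 0.4178 × log2(abundance of gallocatechin)
All samples were classified by model score: a sample with a score below 0 was assigned to Tiegun yam, and a sample with a score above 0 to Taigu yam (Fig. 5C). The performance of the machine learning model was evaluated with an ROC curve; the area under the curve (AUC) was 1, indicating good accuracy and reliability, so the model can be used to distinguish Taigu yam from Tiegun yam (Fig. 5D).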
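Applying the fitted model to a new sample is a direct evaluation of the score formula followed by the sign rule. In the sketch below the intercept and coefficients are those reported above, whereas the function names and the abundance values are hypothetical (chosen as powers of two so the log2 terms are easy to follow). The text does not say how a score of exactly 0 is handled, so the code reports it as undetermined.

```python
import math

INTERCEPT = 3.1627
COEFFICIENTS = {
    "ophiogenin 3-O-beta-L-rhamnopyranosyl-beta-D-glucopyranoside": 0.1860,
    "aspartic acid": 0.0405,
    "epicatechin gallate": 0.0009,
    "schaftoside": -0.1469,
    "gallocatechin": -0.4178,
}


def model_score(abundance):
    """Model score from raw abundances (log2-transformed inside the model)."""
    return INTERCEPT + sum(
        coef * math.log2(abundance[name]) for name, coef in COEFFICIENTS.items()
    )


def classify(score):
    """Sign rule from the text: > 0 is Taigu yam, < 0 is Tiegun yam."""
    if score > 0:
        return "Taigu yam"
    if score < 0:
        return "Tiegun yam"
    return "undetermined"  # score == 0 is not covered by the stated rule


# Hypothetical abundances for two samples.
sample_a = dict(zip(COEFFICIENTS, [2**20, 2**18, 2**15, 2**10, 2**5]))
sample_b = dict(zip(COEFFICIENTS, [2**5, 2**5, 2**10, 2**15, 2**20]))
for sample in (sample_a, sample_b):
    score = model_score(sample)
    print(round(score, 4), classify(score))
```

The signs of the coefficients agree with Table 1: the markers with positive weights have positive log2FC values and those with negative weights have negative ones.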
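3. Conclusion
In this study, 206 metabolites were identified in Taigu yam and Tiegun yam by UPLC-Q-TOF-MS/MS, and 56 differential metabolites between the two yams were screened by metabolomics. LASSO regression identified five key metabolites (ophiogenin 3-O-beta-L-rhamnopyranosyl-beta-D-glucopyranoside, aspartic acid, epicatechin gallate, schaftoside and gallocatechin) as markers for distinguishing the two yam varieties, and a highly reliable model for predicting Taigu yam and Tiegun yam variety was established. This study is a first attempt to combine metabolomics with machine learning to analyze differential markers among yam varieties, and it provides a new strategy that couples metabolomics with LASSO regression to identify differential markers between Taigu yam and Tiegun yam. The work offers a new approach to quality control and variety traceability in the yam industry, and a new method for the intelligent identification of food varieties and nutritional components.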
[1] MENG S L, JIAO F, HE Q Y, et al. Comparison of physicochemical properties of starch and resistant starch in Chinese yam planted in sandy soil and loamy soil[J]. Modern Food, 2023, 29(12): 209−212.
[2] YANG J Y, LI S R, LI Y, et al. Recent research situation of genuine regional drug Dioscorea opposita[J]. Guangming Journal of Chinese Medicine, 2020, 35(11): 1764−1767.
[3] LEE S C, TSAI C C, CHEN J C, et al. The evaluation of reno- and hepatoprotective effects of huai-shan-yao (Rhizome Dioscoreae)[J]. American Journal of Chinese Medicine, 2002, 30(4): 609−616.
[4] LIN X L, LANG K T, ZHENG B D, et al. Research progress of functional properties of Chinese yam and development of its products[J]. Food and Fermentation Industries, 2023, 49(6): 339−346.
[5] AN L, YUAN Y L, MA J W, et al. NMR-based metabolomics approach to investigate the distribution characteristics of metabolites in Dioscorea opposita Thunb. cv. Tiegun[J]. Food Chemistry, 2019, 298: 125063.
[6] MA R, YANG K, LI W H, et al. Analysis of chemical constituents of Huai yam in different growth period[J]. Food Research and Development, 2019, 40(13): 84−92.
[7] LI M. Overview on extractive techniques and pharmacological functions of active constituents in Rhizoma Dioscoreae[J]. Journal of Southern Agriculture, 2013, 44(7): 1184−1190.
[8] CHEN M Y, LIU W, YU G X, et al. Research progress on chemical constituents and pharmacological activities of Dioscorea opposita Thunb[J]. Acta Chinese Medicine and Pharmacology, 2020, 48(2): 62−66.
[9] PAN J Z, MENG Q L, CUI W Y, et al. Advances in studies on functional components and pharmacological effects of Dioscorea opposita Thunb[J]. Science and Technology of Food Industry, 2023, 44(1): 420−428.
[10] LA G X, LI X Y, GUO H X, et al. Research on metabolomics of Tiegun yam and Taigu yam[J]. Journal of Henan Agricultural Sciences, 2017, 46(5): 116−119.
[11] PENG C, REN Y, YE Z, et al. A comparative UHPLC-Q/TOF-MS-based metabolomics approach coupled with machine learning algorithms to differentiate Keemun black teas from narrow-geographic origins[J]. Food Research International, 2022, 158: 111512. doi: 10.1016/j.foodres.2022.111512
[12] LU L, ZHOU C Z, XU K, et al. Analysis of leaf utilization wolfberry tea based on sensory evaluation and metabolomics[J]. Food Science, 2024, 45(7): 191−201.
[13] PENG Y F, CHAO Z, SHUANG G, et al. Metabolomics integrated with machine learning to discriminate the geographic origin of Rougui Wuyi rock tea[J]. npj Science of Food, 2023, 7(1): 7. doi: 10.1038/s41538-023-00187-1
[14] RIVERA-PÉREZ A, ROMERO-GONZÁLEZ R, FRENICH A G. Application of an innovative metabolomics approach to discriminate geographical origin and processing of black pepper by untargeted UHPLC-Q-Orbitrap-HRMS analysis and mid-level data fusion[J]. Food Research International, 2021, 150: 110722. doi: 10.1016/j.foodres.2021.110722
[15] GARCÍA-PÉREZ P, ZHANG L, MIRAS-MORENO B, et al. The combination of untargeted metabolomics and machine learning predicts the biosynthesis of phenolic compounds in Bryophyllum medicinal plants (genus Kalanchoe)[J]. Plants, 2021, 10(11): 2430. doi: 10.3390/plants10112430
[16] DELAFIORI J, NAVARRO L C, SICILIANO R F, et al. Covid-19 automated diagnosis and risk assessment through metabolomics and machine learning[J]. Analytical Chemistry, 2021, 93(4): 2471−2479. doi: 10.1021/acs.analchem.0c04497
[17] GALAL A, TALAL M, MOUSTAFA A. Applications of machine learning in metabolomics: Disease modeling and classification[J]. Frontiers in Genetics, 2022, 13: 1017340. doi: 10.3389/fgene.2022.1017340
[18] SAITO M, KONDO N, YAMAGUCHI H, et al. Plant growth-regulating activities of batatasin III analogues[J]. Plant and Cell Physiology, 1976, 17(3): 411−416.
[19] HASHIMOTO T, TAJIMA M. Structures and synthesis of the growth inhibitors batatasins IV and V, and their physiological activities[J]. Phytochemistry, 1978, 17(7): 1179−1184. doi: 10.1016/S0031-9422(00)94310-3
[20] XUE H, WANG J R, XU W T, et al. The mechanism of antitumor pharmacological effects of isoliquiritigenin[J]. Farm Products Processing, 2019(14): 69−70.
[21] ZHAO W W, LEI T. Research progress on pectolinarigenin[J]. Journal of Chinese Medicinal Materials, 2023, 46(3): 796−799.
[22] ZENG L, ZHANG S, AI J. Research progress of epigallocatechin gallate[J]. Journal of Hebei North University (Medical Edition), 2021, 11(1): 38−59.
[23] ZHANG S, MAO B, CUI S, et al. Absorption, metabolism, bioactivity, and biotransformation of epigallocatechin gallate[J]. Critical Reviews in Food Science and Nutrition, 2023: 1−21.
[24] LIU C, CHEN R Y. Advance of chemistry and bioactivities of catechin and its analogues[J]. China Journal of Chinese Materia Medica, 2004, 29(10): 1017−1021.
[25] PENG C A, TAO F J, LI X, et al. Research progress on the application of allantoin[J]. Anhui Agricultural Science Bulletin, 2021, 27(22): 114−115.
[26] FAN J, TANG S W, YU H Z, et al. Research progress on allantoin in yam[J]. Modern Agricultural Science and Technology, 2015(3): 308−317.
[27] YIN Q W, TU P N, XIE B C. Research progress on mechanism of allantoin[J]. Drugs & Clinic, 2022, 37(12): 2897−2901.
[28] CUI H Y, XU S, ZHANG L F, et al. The key techniques and future vision of feature selection in machine learning[J]. Journal of Beijing University of Posts and Telecommunications, 2018, 41(1): 1−12.
[29] YAN C, TIAN X H, ALAYI·Ahan, et al. Classification of metabolic syndrome based on Lasso feature selection[J]. Journal of Public Health and Preventive Medicine, 2017, 28(6): 31−33.
[30] ZHANG L J, WEI X Y, LU J Q, et al. Lasso regression: From explanation to prediction[J]. Advances in Psychological Science, 2020, 28(10): 1777−1788.
[31] HUANG X, ZHENG S Y, ZHANG Z Y, et al. A prediction model involving biomarkers for the risk of metabolic syndrome using the Lasso regression[J]. Chinese Journal of Convalescent Medicine, 2024, 33(1): 1−5.