Abstract
Objective: To improve the identification success rate of Dalbergia species, and to compare machine learning methods with the traditional distance-based and phylogenetic tree-based methods in order to select the optimal ITS barcode analysis method. Methods: A total of 402 ITS sequences representing 96 Dalbergia species were used, of which 3 were obtained experimentally and 399 were downloaded from NCBI. Using the ITS barcode as the molecular marker, the identification success rates of the distance method, the phylogenetic tree method, and machine learning methods were compared for Dalbergia species. Results: In the machine learning analysis, the average identification success rate for Dalbergia species was 39.59%. BLOG identified 42 Dalbergia species, with 95.75% of sequences correctly classified. SMO, Naïve Bayes, JRip, and J48 identified 34 species, with 79.10%, 58.71%, 72.64%, and 76.37% of sequences correctly classified, respectively. The phylogenetic tree-based and distance-based analyses achieved identification success rates of 28.13% and 36.46%, respectively. Conclusion: ITS barcode identification of the botanical origin of Dalbergia based on machine learning achieves a higher identification success rate and greater socio-economic efficiency than the distance-based and phylogenetic tree-based methods. Machine learning methods based on the ITS barcode are recommended as the first choice for identifying Dalbergia species.
Authors
邝家荣
刘巧珍
代江鹏
谭智杰
林月霞
高晓霞
朱爽
KUANG Jiarong; LIU Qiaozhen; DAI Jiangpeng; TAN Zhijie; LIN Yuexia; GAO Xiaoxia; ZHU Shuang (School of Life Sciences and Biopharmaceutics, Guangdong Pharmaceutical University, Guangzhou 510006, China; School of Pharmacy, Guangdong Pharmaceutical University, Guangzhou 510006, China)
Source
《中草药》
CAS
CSCD
Peking University Core Journals (北大核心)
2024, No. 11, pp. 3825-3834 (10 pages)
Chinese Traditional and Herbal Drugs
Funding
Guangdong Basic and Applied Basic Research Foundation, Natural Science Foundation General Program (2022A1515011268).