Abstract
Breast cancer is closely related to the expression of the estrogen receptor α subtype. Using data from a mathematical modeling exercise, this article applies random forest, grey relational analysis, and stepwise regression in turn to the antagonist information provided, with each method selecting the 30 molecular descriptors that contribute most. Finally, the results of the three models are combined through a voting mechanism and the correlations between molecular descriptors are tested, yielding the 20 molecular descriptors with the greatest influence on biological activity and thereby supporting the screening of therapeutic drugs.
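The abstract does not spell out how the voting and the correlation test are carried out, so the following is only a minimal sketch of one plausible reading: each method contributes a ranked list of descriptors, descriptors are ordered by how many lists they appear in, and a candidate is skipped when it is strongly correlated with a descriptor already kept. The function names (`pearson`, `vote_and_filter`), the tie-breaking rule, the 0.9 correlation threshold, and the toy data are all assumptions made for illustration, not details taken from the article.

```python
from collections import Counter
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def vote_and_filter(rankings, columns, n_keep=20, max_abs_corr=0.9):
    """Combine per-method top lists by voting, then skip near-duplicate descriptors.

    rankings: one list of descriptor names per method, best first.
    columns:  dict mapping descriptor name -> its values over the compounds.
    The threshold and tie-breaking rule are illustrative assumptions.
    """
    # One vote for every list a descriptor appears in.
    votes = Counter(name for ranking in rankings for name in ranking)

    # Break ties by the best (lowest) position reached in any list, then by name.
    best_pos = {}
    for ranking in rankings:
        for pos, name in enumerate(ranking):
            best_pos[name] = min(pos, best_pos.get(name, pos))
    candidates = sorted(votes, key=lambda d: (-votes[d], best_pos[d], d))

    # Greedy correlation filter: keep a candidate only if it is not strongly
    # correlated with any descriptor that has already been kept.
    selected = []
    for name in candidates:
        if all(abs(pearson(columns[name], columns[s])) <= max_abs_corr
               for s in selected):
            selected.append(name)
        if len(selected) == n_keep:
            break
    return selected


# Toy data (hypothetical): three short rankings and five descriptor columns.
rankings = [
    ["d1", "d2", "d3"],  # stand-in for the random forest ranking
    ["d2", "d1", "d4"],  # stand-in for the grey relational analysis ranking
    ["d2", "d3", "d5"],  # stand-in for the stepwise regression ranking
]
columns = {
    "d1": [1, 2, 3, 4],
    "d2": [2, 1, 4, 3],
    "d3": [2, 4, 6, 8],  # exact multiple of d1, so perfectly correlated with it
    "d4": [1, 0, 1, 0],
    "d5": [4, 1, 3, 2],
}
result = vote_and_filter(rankings, columns, n_keep=3)
print(result)
```

In the setting described by the abstract, the three input lists would each hold 30 descriptors and `n_keep` would be 20; the greedy filter is just one simple way to act on a correlation test.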
Author
ZENG Qiaofeng (曾巧凤), School of Sciences, Southwest Petroleum University, Chengdu 610599, China
Source
Chinese Journal of Computer Application (《计算机应用文摘》)
2023, No. 21, pp. 95-97 (3 pages)
Keywords
feature screening
random forest
grey relational analysis
stepwise regression