Abstract
To improve the efficiency of drug research and development, quantitative structure-activity relationship (QSAR) models are commonly used to predict the bioactivity of compounds so that candidates can be screened and optimized. QSAR methods based on statistical analysis, however, cope poorly as the number of variables grows rapidly, and their prediction accuracy still leaves room for improvement. This paper therefore proposes an improved PCA algorithm to reduce the dimensionality of the variables, together with a back propagation neural network optimized by an improved sparrow search algorithm (ISSA-BPNN), to improve prediction accuracy. The improved PCA algorithm first selects the main feature variables using a weighted score that combines the Pearson correlation coefficient, the maximal information coefficient (MIC), and random forest (RF), and then applies PCA to reduce the dimensionality of the original features and obtain the main input variables. The ISSA-BPNN algorithm optimizes the weights and thresholds of the BPNN in order to stabilize the output and ensure global convergence. The method is trained and tested on biological activity data of compounds against ERα, a target in breast cancer treatment. Compared with several other algorithms, the proposed algorithm achieves higher prediction accuracy and provides an effective method for drug research and development.
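The weighted-scoring step described in the abstract can be illustrated with a minimal sketch. The abstract does not state the weights, the normalization, or how ties are handled, so the equal default weights, the min-max scaling, and the function names below (`min_max`, `weighted_feature_scores`, `top_k_features`) are all assumptions made for illustration, not the authors' implementation. The MIC values and random forest importances are assumed to have been computed beforehand by other tools; the subsequent PCA and ISSA-BPNN stages are not shown.

```python
from typing import List, Sequence


def min_max(values: Sequence[float]) -> List[float]:
    """Scale values to [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def weighted_feature_scores(
    pearson_r: Sequence[float],
    mic: Sequence[float],
    rf_importance: Sequence[float],
    weights: Sequence[float] = (1.0, 1.0, 1.0),  # assumed: equal weights
) -> List[float]:
    """Combine three per-feature relevance measures into one score.

    Each measure holds one value per feature. The Pearson coefficient is
    taken in absolute value so that strong negative correlations count
    as relevant. Each measure is min-max scaled before weighting
    (an assumption; the abstract does not specify the scaling).
    """
    if not (len(pearson_r) == len(mic) == len(rf_importance)):
        raise ValueError("all score lists must have one entry per feature")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    w = [x / total for x in weights]  # normalize weights to sum to 1
    columns = [
        min_max([abs(r) for r in pearson_r]),
        min_max(mic),
        min_max(rf_importance),
    ]
    n_features = len(mic)
    return [
        sum(wi * col[j] for wi, col in zip(w, columns))
        for j in range(n_features)
    ]


def top_k_features(scores: Sequence[float], k: int) -> List[int]:
    """Return the indices of the k highest-scoring features."""
    order = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return order[:k]


if __name__ == "__main__":
    # Hypothetical scores for three molecular descriptors.
    scores = weighted_feature_scores(
        pearson_r=[0.9, -0.2, 0.5],
        mic=[0.8, 0.1, 0.6],
        rf_importance=[0.7, 0.05, 0.25],
    )
    print(top_k_features(scores, k=2))
```

Scaling each measure to a common range before weighting keeps one measure from dominating simply because its raw values are larger; other normalizations, such as rank-based scaling, would be equally plausible readings of the abstract.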
Authors
陈强
王登文
铁治欣
洪亮
CHEN Qiang; WANG Dengwen; TIE Zhixin; HONG Liang (School of Information Science and Technology, Zhejiang Sci-Tech University, Hangzhou 310018, China; Keyi College of Zhejiang Sci-Tech University, Shaoxing, Zhejiang 312369, China; College of Media Engineering, Communication University of Zhejiang, Hangzhou 310018, China)
Source
《智能计算机与应用》
2022, No. 7, pp. 84-89 (6 pages)
Intelligent Computer and Applications
Funding
National Natural Science Foundation of China (61671407).