Abstract
To select text features more comprehensively and improve the accuracy of text feature selection, a text feature selection method based on Invasive Weed Optimization (IWO) is proposed. In IWO, offspring individuals are spread randomly around their parents according to a normal (Gaussian) distribution, and the standard deviation of that distribution is adjusted dynamically during evolution. In the early and middle iterations the algorithm therefore preserves population diversity and selects features from the text fairly comprehensively; in the late iterations it strengthens feature selection around the best individuals, so that the algorithm converges steadily to the global optimum and the accuracy of text feature selection improves. Experimental results show that the method gives terms with low weight values a chance of being selected while preserving the selection advantage of terms with high weight values, thereby improving both the comprehensiveness and the accuracy of text feature selection.
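The abstract does not give the exact rule used to adjust the standard deviation. As a rough illustration only, the sketch below uses the schedule commonly cited for standard IWO, in which the standard deviation decays nonlinearly from an initial to a final value. The function names, parameter names, and default values are hypothetical and are not taken from the paper.

```python
import random


def iwo_sigma(iteration, max_iterations,
              sigma_initial=1.0, sigma_final=0.01, modulation=3):
    """Standard deviation for a given iteration.

    Assumption: the commonly cited IWO schedule
    sigma = ((T - t) / T) ** n * (sigma_initial - sigma_final) + sigma_final,
    not necessarily the schedule used in the paper.
    """
    remaining = (max_iterations - iteration) / max_iterations
    return (remaining ** modulation) * (sigma_initial - sigma_final) + sigma_final


def spread_offspring(parent, sigma, count, rng):
    """Scatter `count` offspring around `parent`, adding N(0, sigma^2) noise
    independently to each coordinate of the parent's position."""
    return [[x + rng.gauss(0.0, sigma) for x in parent] for _ in range(count)]


if __name__ == "__main__":
    rng = random.Random(0)          # fixed seed so runs are repeatable
    parent = [0.5, 0.2, 0.8]        # hypothetical position, one value per term
    for t in (0, 50, 100):
        sigma = iwo_sigma(t, 100)
        child = spread_offspring(parent, sigma, count=1, rng=rng)[0]
        print(t, round(sigma, 4), [round(v, 3) for v in child])
```

With a large standard deviation in early iterations, offspring land far from their parents, which is what lets low-weight terms be explored; as the standard deviation shrinks, offspring cluster around the best individuals.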
Source
Journal of Computer Applications (《计算机应用》)
CSCD
PKU Core Journal (北大核心)
2012, Issue 8, pp. 2245-2249 (5 pages)
Keywords
text feature
feature selection
Invasive Weed Optimization (IWO)